DISTRIBUTIONAL CONVERGENCE FOR THE NUMBER OF 
SYMBOL COMPARISONS USED BY QUICKSELECT 
Abstract

When the search algorithm QuickSelect compares keys during its execution in
order to find a key of target rank, it must operate on the keys' representations or
internal structures, which were ignored by the previous studies that quantified the
execution cost for the algorithm in terms of the number of required key comparisons.
In this paper, we analyze running costs for the algorithm that take into account
not only the number of key comparisons but also the cost of each key comparison.
We suppose that keys are represented as sequences of symbols generated by various
probabilistic sources and that QuickSelect operates on individual symbols in order
to find the target key. We identify limiting distributions for the costs and derive
integral and series expressions for the expectations of the limiting distributions.
These expressions are used to recapture previously obtained results on the number
of key comparisons required by the algorithm.

QuickSelect, introduced by Hoare [11] in 1961 and also known as Find or
"Hoare's selection algorithm", is a simple search algorithm widely used for finding
a key (an object drawn from a linearly ordered set) of target rank in a file of keys.
We briefly review the operation of the algorithm. Suppose that there are n keys
(we will suppose that these are all distinct) and that the target rank is m, where
$1 \le m \le n$. QuickSelect = QuickSelect(n, m) chooses a uniformly random key,
called the pivot, and compares each other key to it. This determines the rank j (say)
of the pivot. If j = m, then the algorithm returns the pivot key and terminates. If
j > m, then QuickSelect is applied recursively to find the key of rank m in the
set of j - 1 keys found to be smaller than the pivot. If j < m, then QuickSelect
is applied recursively to find the key of rank m - j in the set of n - j keys larger
than the pivot.
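
The recursion just described can be sketched in a few lines of Python. This is an illustrative sketch, not code from the paper: the function name `quickselect`, the loop form of the recursion, and the use of Python's `random` module for the pivot choice are assumptions made here, and the count returned is the number of key comparisons only (one per non-pivot key in each round).

```python
import random

def quickselect(keys, m, rng=random):
    """Return (key of rank m, number of key comparisons); ranks are 1-based.

    Assumes the keys are distinct, as in the text.
    """
    keys = list(keys)
    comparisons = 0
    while True:
        pivot = rng.choice(keys)               # uniformly random pivot
        smaller = [x for x in keys if x < pivot]
        larger = [x for x in keys if x > pivot]
        comparisons += len(keys) - 1           # the pivot meets every other key once
        j = len(smaller) + 1                   # rank of the pivot
        if j == m:
            return pivot, comparisons
        if j > m:
            keys = smaller                     # rank m among the j - 1 smaller keys
        else:
            keys, m = larger, m - j            # rank m - j among the n - j larger keys
```

For instance, `quickselect([5, 3, 9, 1, 7], 2)` returns the second-smallest key, 3, together with a comparison count that depends on the random pivots.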

Many studies have examined this algorithm to quantify its execution costs (a
non-exhaustive list of references is Knuth [13]; Mahmoud, Modarres, and Smythe [15];
Prodinger [18]; Grübel and Rösler [10]; Lent and Mahmoud [14]; Grübel [9];
Mahmoud and Smythe [16]; Devroye [4]; Hwang and Tsai [12]; Fill and Nakama [8];
and Vallée, Clément, Fill, and Flajolet [23]); and all of them except for Fill and
Nakama [8] and Vallée et al. [23] have conducted the quantification with regard to
the number of key comparisons required by the algorithm to achieve its task. As a
result, most of the theoretical results on the complexity of QuickSelect are about
expectations or distributions for the number of required key comparisons.

However, one can reasonably argue that analyses of QuickSelect in terms of 
the number of key comparisons cannot fully quantify its complexity. For instance, 
if keys are represented as binary strings, then individual bits of the strings must be 
compared in order for QuickSelect to complete its task, and results obtained by 
analyzing the algorithm with respect to the number of bit comparisons required to 
find a target key more accurately reflect actual execution costs. (We will consider
bit comparisons as an example of symbol comparisons.) When QuickSelect (or 
any other algorithm) compares keys during its execution, it must operate on the 
keys' representations or internal structures, so these should not be ignored in fully 
characterizing the performance of the algorithm. Also, symbol-complexity analysis 
allows us to compare key-based algorithms such as QuickSelect and Quicksort 
with digital algorithms such as those utilizing digital search trees. 

Fill and Janson [7] pioneered symbol-complexity analysis by analyzing the ex- 
pected number of bit comparisons required by Quicksort. They assumed that the 
algorithm is applied to keys that are i.i.d. (independent and identically distributed) 
from the uniform distribution over (0,1) and represented (via their binary expan- 
sions) as binary strings, and that the algorithm operates on individual bits in order 
to do comparisons and find the target key. They found that the expected number 
of bit comparisons required by Quicksort to sort n keys is asymptotically equivalent
to $n (\ln n)(\lg n)$ (where lg denotes binary logarithm), whereas the lead-order
term of the expected number of key comparisons is $2 n \ln n$, smaller by a factor of
order log n. In their Section 6 they also considered i.i.d. keys drawn from other 
distributions with density on (0, 1). 

By closely following [7], Fill and Nakama [8] studied the expected number of
bit comparisons required by QuickSelect. More precisely, they treated the case
of i.i.d. uniform keys represented as binary strings and produced exact expressions
for the expected number of bit comparisons by QuickSelect(n, m) for general
n and m. Their asymptotic results were limited to the algorithms QuickMin,
QuickMax, and QuickRand. Here QuickMin refers to QuickSelect applied to find
the smallest key, i.e., to QuickSelect(n, m) with m = 1; and QuickMax similarly
refers to QuickSelect(n, m) with m = n. QuickRand is the algorithm that results
from taking m to be uniformly distributed over {1, 2, . . . , n}. They showed
that the expected number of bit comparisons required by QuickMin or QuickMax 
is asymptotically linear in n with lead-order coefficient approximately equal to 
5.27938. Thus in these cases the expected number of bit comparisons is asymptot- 
ically larger than that of key comparisons required to complete the same task only 
by a constant factor, since the expectation for key comparisons is asymptotically 
2n. Fill and Nakama [8] also found that the expected number of bit comparisons 
required by QuickRand is also asymptotically linear in n (with slope approximately 
8.20731), as for key comparisons (with slope 3). 

Vallée et al. [23] extended the average-case analyses of [7] and [8] to keys
represented by sequences of general symbols generated by any of a wide variety of sources
that include memoryless, Markov, and other dynamical sources. They broadly
extended the results of [8] in another direction as well by treating QuickQuant(n, α)
for general $\alpha \in [0, 1]$, not just QuickMin, QuickMax, and QuickRand. Here the
algorithm QuickQuant(n, α) (for "Quick Quantile") refers to QuickSelect(n, $m_n$) with
$m_n/n \to \alpha$. Roughly summarized, Vallée et al. showed that if symbols are
generated by a suitably nice source, then the expected number of symbol comparisons
in processing a file of n keys is of order $n \log^2 n$ for Quicksort and, for any α, of
order n for QuickQuant(n, α). (For example, all memoryless sources are suitably
nice.) For a more detailed discussion of sources and the results of Vallée et al. [23]
for QuickQuant, see Section 2.

The main purpose of this paper is to extend the average-case analysis of Vallée et
al. [23] by establishing limiting distributions for the number of symbol comparisons. 
To our knowledge the present paper is the first to establish a limiting distribution 
for the number of symbol comparisons required by any key-based algorithm. Our 
elementary approach allows us to handle rather general kinds of "cost" for com- 
paring two keys, and in particular to recover in a rather direct way known results 
about the number of key comparisons. There is no disadvantage to allowing general 
costs, since our results rely on at most broad limitations on the nature of the cost. 

Outline of the paper. We shall be concerned primarily with QuickQuant =
QuickQuant(n, α), which is what we call the algorithm QuickSelect when applied
to find the key of rank $m_n$ in a file of size n, where we are given $0 \le \alpha \le 1$ and
a sequence $(m_n)$ such that $m_n/n \to \alpha$. It turns out to be convenient mathematically
to analyze a close cousin to QuickQuant introduced by Vallée et al. [23],
namely, QuickVal, and then treat QuickQuant by comparison. So, after a careful
description of the probabilistic models used to govern the generation of keys in
Section 2.1, a review of known results about key and symbol comparisons in
Section 2.2, and a description of QuickVal in Section 2.3, in Section 3 we establish
limiting-distribution results for QuickVal (whose main theorems are Theorem 3.1
and Theorem 3.4) and then move on to QuickQuant in Section 4 (which contains
Theorem 5.1, the main theorem of this paper).

Subsequent to the research leading to the present paper, and using a rather 
different approach, the first author [6] has found a limiting distribution for the 
number of symbol comparisons used by Quicksort for a wide variety of probabilistic 
sources. 

Remark 1.1. Although the contraction method has been used in finding limiting
distributions for the number of key comparisons required by recursive algorithms
such as Quicksort (e.g., Rösler [20], Rösler and Rüschendorf [21]), our analysis
does not depend on it. In examining convergence for the number of key comparisons
used by QuickQuant, Grübel and Rösler [10] mentioned that they did not use the
contraction method due to the parameter that represents target rank. (However,
they did engage in contraction arguments to characterize the limiting distribution.)
Interestingly, Mahmoud et al. [15] succeeded in establishing a fixed point equation
to identify the limiting distributions of the normalized numbers of key comparisons
required by QuickRand, QuickMin, and QuickMax. Régnier [19] used martingales
to show convergence for the number of key comparisons required by Quicksort.

2. Background and preliminaries 

2.1. Probabilistic source models for the keys. In this subsection we describe 
what is meant by a probabilistic source, our model for how the i.i.d. keys are 
generated, using the terminology and notation of Vallée et al. [23].

Let $\Sigma$ denote a totally ordered alphabet (set of symbols), assumed to be isomorphic
either to {0, . . . , r - 1} for some finite r or to the full set of nonnegative
integers, in either case with the natural order; a word is then an element of $\Sigma^\infty$,
i.e., an infinite sequence (or "string") of symbols. We will follow the customary
practice of denoting a word $w = (w_1, w_2, \dots)$ more simply by $w_1 w_2 \cdots$.

We will use the word "prefix" in two closely related ways. First, the symbol
strings belonging to $\Sigma^k$ are called prefixes of length k, and so $\Sigma^* := \bigcup_{0 \le k < \infty} \Sigma^k$
denotes the set of all prefixes of any nonnegative finite length. Second, if
$w = w_1 w_2 \cdots$ is a word, then we will call

(2.1)    $w(k) := w_1 w_2 \cdots w_k \in \Sigma^k$

its prefix of length k.

Lexicographic order is the linear order (to be denoted in the strict sense by $\prec$
and in the weak sense by $\preceq$) on the set of words specified by declaring that $w \prec w'$
if (and only if) for some $0 \le k < \infty$ the prefixes of w and w' of length k are equal
but $w_{k+1} < w'_{k+1}$. We denote the cost of determining $w \prec w'$ when comparing
distinct words w and w' by c(w, w'); we will always assume that the function c is
symmetric and nonnegative.

Example 2.1. Here is an example of a natural class of cost functions. Start with
nonnegative symmetric functions $c_i : \Sigma \times \Sigma \to [0, \infty)$, $i = 1, 2, \dots$, modeling the
cost of comparing symbols in the respective ith positions of two words. This allows
for the symbol-comparison costs to depend both on the positions of the symbols in
the words and on the symbols themselves. Then, for comparisons of distinct words,
define

(2.2)    $c(w, w') := \sum_{i=1}^{k+1} c_i(w_i, w'_i) = \sum_{i=1}^{k} c_i(w_i, w_i) + c_{k+1}(w_{k+1}, w'_{k+1})$,

where k is the length of the longest common prefix of w and w'.

(a) If $c_i \equiv \delta_{i_0, i}$ (independent of the symbols being compared) for given positive
integer $i_0$, then c is the cost used in counting comparisons of symbols in position $i_0$.
(For example, if $i_0 = 1$ then $c \equiv 1$ is the cost used in counting key comparisons.)
Observe that all finite linear combinations of such cost functions $\delta_{i_0, \cdot}$ are of the
form (2.2), and therefore, by the Cramér-Wold device (e.g., [H Section 29]), if $S_{i_0}$
denotes the total number of comparisons of symbols in position $i_0$, then the joint
distribution of $(S_1, S_2, \dots)$ can (at least in principle) be obtained by studying cost
functions of the form (2.2).

(b) If $c_i \equiv 1$ for all i, then c = k + 1 is the cost used in counting symbol
comparisons.
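
To make the cost (2.2) concrete, here is a minimal Python sketch. It is an illustration under stated assumptions rather than anything taken from the paper: words are truncated to finite sequences that differ somewhere within their common length, and the names `comparison_cost` and `symbol_cost` are hypothetical. The callable `symbol_cost(i, a, b)` plays the role of the positional cost for the (1-based) ith position.

```python
def comparison_cost(w, w_prime, symbol_cost=lambda i, a, b: 1):
    """Cost of comparing two distinct words in the style of (2.2).

    The default symbol_cost charges 1 per symbol comparison, so the result
    is k + 1, where k is the length of the longest common prefix.
    """
    total = 0
    for i, (a, b) in enumerate(zip(w, w_prime), start=1):
        total += symbol_cost(i, a, b)
        if a != b:                 # position k + 1: the first mismatch decides the order
            return total
    raise ValueError("words must differ within their common length")
```

With the default cost this counts symbol comparisons as in part (b); passing `lambda i, a, b: int(i == 1)` charges only for the first position and so counts key comparisons as in part (a).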

A probabilistic source is simply a stochastic process $W = W_1 W_2 \cdots$ with state
space $\Sigma$ (endowed with its total σ-field) or, equivalently, a random variable W
taking values in $\Sigma^\infty$ (with the product σ-field). According to Kolmogorov's
consistency criterion (e.g., [H Theorem 3.3.6]), the distributions μ of such processes
are in one-to-one correspondence with consistent specifications of finite-dimensional
marginals, that is, of the probabilities

    $p_w := \mu(\{w_1 \cdots w_k\} \times \Sigma^\infty)$,    $w = w_1 w_2 \cdots w_k \in \Sigma^*$.

Here the fundamental probability $p_w$ is the probability that a word drawn from μ
has $w_1 \cdots w_k$ as its length-k prefix.

Because the analysis of QuickSelect is significantly more complicated when its
input keys are not all distinct, we will restrict attention to probabilistic sources
with continuous distributions μ. Expressed equivalently in terms of fundamental
probabilities, our continuity assumption is that for any $w = w_1 w_2 \cdots \in \Sigma^\infty$ we
have $p_{w(k)} \to 0$ as $k \to \infty$, recalling the prefix notation (2.1).

Example 2.2. We present a few classical examples of sources. For more examples,
and for further discussion, see Section 3 of [23].

(a) In computer science jargon, a memoryless source is one with $W_1, W_2, \dots$
i.i.d. Then the fundamental probabilities have the product form

    $p_w = p_{w_1} p_{w_2} \cdots p_{w_k}$,    $w = w_1 w_2 \cdots w_k \in \Sigma^*$.

(b) A Markov source is one for which $W_1 W_2 \cdots$ is a Markov chain.

(c) An intermittent source over the finite alphabet $\Sigma = \{0, \dots, r - 1\}$ models
long-range dependence of the symbols within a key and is defined by specifying the
conditional distributions $\mathcal{L}(W_j \mid W_1, \dots, W_{j-1})$ in a way that pays special attention
to a particular symbol $\underline{a}$. The source is said to be intermittent of exponent $\gamma > 0$
with respect to $\underline{a}$ if $\mathcal{L}(W_j \mid W_1, \dots, W_{j-1})$ depends only on the maximum value k
such that the last k symbols in the prefix $W_1 \cdots W_{j-1}$ are all $\underline{a}$ and (i) is the uniform
distribution on $\Sigma$, if k = 0; and (ii) if $1 \le k \le j - 1$, assigns mass $[k/(k + 1)]^\gamma$ to $\underline{a}$
and distributes the remaining mass uniformly over the remaining elements of $\Sigma$.

We next present an equivalent description of probabilistic sources (with a
corresponding equivalent condition for continuity) that will prove convenient because it
allows us to treat all sources within a uniform framework. If M is any measurable
mapping from (0, 1) (with its Borel σ-field) into $\Sigma^\infty$ and U is distributed unif(0, 1),
then M(U) is a probabilistic source. Conversely, given any probability measure μ
on $\Sigma^\infty$ there exists a monotone measurable mapping M such that M(U) has
distribution μ when U ~ unif(0, 1); here (weakly) monotone means that $M(t) \preceq M(u)$
whenever t < u. Indeed, if F is the distribution function

    $F(w) := \mu\{w' : w' \preceq w\}$,    $w \in \Sigma^\infty$,

for μ, then we can always use the inverse probability transform

    $M(u) := \inf\{w \in \Sigma^\infty : u \le F(w)\}$,    $u \in (0, 1)$,

for M. The measure μ is continuous if and only if this M is strictly monotone.

So henceforth we will assume that our keys are generated as $M(U_1), \dots, M(U_n)$,
where $M : (0, 1) \to \Sigma^\infty$ is strictly monotone and $U_1, \dots, U_n$ (we will call these the
"seeds" of the keys) are i.i.d. unif(0, 1). Given a specification of costs c(w, w') in
comparing words, we can now define a source-specific notion of cost by setting

    $\beta(u, t) := c(M(u), M(t))$.

In our main application, $\beta_{\mathrm{symb}}(u, t)$ represents the number of symbol comparisons
required to compare words with seeds u and t.
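
As a concrete instance, the symbol-comparison cost of two seeds can be computed directly for a memoryless source. The sketch below is illustrative only and rests on an assumption stated here: for a memoryless source, the seeds whose words extend a given prefix by one symbol form consecutive subintervals, in symbol order, with lengths proportional to the symbol probabilities. The names `next_symbol` and `symbol_comparisons` are hypothetical, and floating-point seeds limit the usable depth.

```python
def next_symbol(x, a, b, probs):
    """Return (symbol, lo, hi): the piece of the interval (a, b) that contains seed x."""
    lo = a
    for s, p in enumerate(probs):
        hi = lo + p * (b - a)
        if x < hi or s == len(probs) - 1:   # last symbol absorbs rounding at the right end
            return s, lo, hi
        lo = hi

def symbol_comparisons(u, t, probs, max_depth=64):
    """Number of symbol comparisons needed to compare the words with seeds u and t."""
    a, b = 0.0, 1.0                          # interval of seeds sharing the current prefix
    for k in range(max_depth):
        su, lo, hi = next_symbol(u, a, b, probs)
        st, _, _ = next_symbol(t, a, b, probs)
        if su != st:
            return k + 1                     # k common symbols, then one deciding comparison
        a, b = lo, hi
    raise ValueError("seeds too close to separate at this depth")
```

With `probs=[0.5, 0.5]` the words are the binary expansions of the seeds, so seeds 0.1 and 0.2 (expansions 0.000110... and 0.001100...) need three bit comparisons.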

The following associated terminology and notation from [23] will also prove useful.
For each prefix $w \in \Sigma^*$, we let $I_w = (a_w, b_w)$ denote the interval that contains
all seeds whose corresponding words begin with w and $\mu_w := (a_w + b_w)/2$ its
midpoint. We call $I_w$ the fundamental interval associated with w. (There is no need
to be fussy as to whether the interval is open or closed or half-open, because the
probability that a random seed U takes any particular value is 0. Also, we always






JAMES ALLEN FILL AND TAKEHIKO NAKAMA 



assume that $a_w < b_w$, since the case that $a_w = b_w$ will not concern us.) The
fundamental probability $p_w$ can be expressed as $b_w - a_w$. The fundamental triangle of
prefix w, denoted by $T_w$, is the triangular region

    $T_w := \{(u, t) : a_w < u < t < b_w\}$,

and when w is the empty prefix we denote this triangle by T:

    $T := \{(u, t) : 0 < u < t < 1\}$.

For some of our results, the quantity

(2.3)    $\pi_k := \max\{p_w : w \in \Sigma^k\}$

will play an important role. The following definition of a Π-tamed probabilistic
source is taken (with slight modification) from [23]:

Definition 2.3. Let $0 < \gamma < \infty$ and $0 < A < \infty$. We say that the source is Π-tamed
(with parameters γ and A) if the sequence $(\pi_k)$ at (2.3) satisfies

    $\pi_k \le A (k + 1)^{-\gamma}$ for every $k \ge 0$.

Observe that a Π-tamed source is always continuous. There is a related condition
for cost functions β that will be assumed (for suitable values of the parameters) in
some of our results:

Definition 2.4. Let $0 < \epsilon < \infty$ and $0 < c < \infty$. We say that the symmetric cost
function $\beta \ge 0$ is tamed (with parameters ε and c) if

    $\beta(u, t) \le c (t - u)^{-\epsilon}$ for all $(u, t) \in T$.

We say that β is ε-tamed if it is tamed with parameters ε and c for some c.

We leave it to the reader to make the simple verification that a source is Π-tamed
with parameters γ and A if and only if $\beta_{\mathrm{symb}}$ is tamed with parameters ε = 1/γ
and $c = A^{1/\gamma}$.

Remark 2.5. (a) Many common sources have geometric decrease in $\pi_k$ (call these
"g-tamed") and so for any γ are Π-tamed with parameters γ and A for suitably
chosen $A \equiv A_\gamma$ [equivalently, the symbol-comparisons cost $\beta_{\mathrm{symb}}$ is ε-tamed for
any ε; in fact, if $\pi_k \le b^{-k}$ for every k, then

    $\beta_{\mathrm{symb}}(u, t) \le 1 + \log_b \frac{1}{t - u}$ for all $(u, t) \in T$].

For example, a memoryless source satisfies $\pi_k = p_{\max}^k$, where

    $p_{\max} := \sup_{w \in \Sigma^1} p_w$

satisfies $p_{\max} < 1$ except in the highly degenerate case of an essentially
single-symbol alphabet. We also have $\pi_k \le p_{\max}^{k-1}$ for any Markov source, where now $p_{\max}$
is the supremum of all one-step transition probabilities, and so such a source is
g-tamed provided $p_{\max} < 1$. Expanding dynamical sources (cf. Clément, Flajolet,
and Vallée [5]) are also g-tamed.

(b) For an intermittent source as in Example 2.2(c), for all large k the maximum
probability $\pi_k$ is attained by the word $\underline{a}^k$ and equals

    $\pi_k = r^{-1} k^{-\gamma}$.

Intermittent sources are therefore examples of Π-tamed sources for which $\pi_k$ decays
at a truly inverse-polynomial rate, not an exponential rate as in the case of g-tamed
sources.
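
The inverse-polynomial rate can be checked numerically for the word consisting of k copies of the special symbol. The sketch below is illustrative: it assumes the intermittent-source rules of Example 2.2 (first symbol uniform on r symbols, then conditional mass [run/(run + 1)] raised to the exponent after a run of that length), reads the closed form as 1/(r k^gamma), and uses the hypothetical name `prob_all_a`. It computes only the probability of that one word and does not verify that this word attains the maximum.

```python
import math

def prob_all_a(k, r, gamma):
    """Probability that the first k symbols all equal the special symbol."""
    prob = 1.0 / r                            # first symbol: uniform over r symbols
    for run in range(1, k):                   # run = number of special symbols seen so far
        prob *= (run / (run + 1)) ** gamma    # conditional mass on one more special symbol
    return prob

# The product telescopes: prod_{run=1}^{k-1} run/(run+1) = 1/k.
for k in (1, 5, 50):
    assert math.isclose(prob_all_a(k, r=3, gamma=1.5), k ** -1.5 / 3)
```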

2.2. Known results for the numbers of key and symbol comparisons. In
this subsection we give for QuickSelect an abbreviated review of what is already
known about the distribution of the number of key comparisons ($\beta \equiv 1$ in our
notation) and (from Vallée et al. [23]) about the expected number of symbol
comparisons ($\beta = \beta_{\mathrm{symb}}$). To our knowledge, no other cost functions have previously
been considered, nor has there been any treatment of the full distribution of the
number of symbol comparisons.

Let $K_{n,m}$ denote the number of key comparisons required by the algorithm to
find a key of rank m in a file of n keys (with $1 \le m \le n$). Thus $K_{n,1}$ and $K_{n,n}$
represent the key comparison costs required by QuickMin and QuickMax, respectively.
(Clearly $K_{n,1} \overset{\mathcal{L}}{=} K_{n,n}$.) It has been shown (see Mahmoud et al. [15], Hwang and
Tsai [12]) that as $n \to \infty$, $K_{n,1}/n$ converges in law to the Dickman distribution,
which can be described as the distribution of the perpetuity

    $1 + \sum_{k \ge 1} U_1 \cdots U_k$,

where $U_k$ are i.i.d. uniform(0, 1). Mahmoud et al. [15] established a fixed-point
equation for the limiting distribution of the normalized (by dividing by n) number
of key comparisons required by QuickRand and also explicitly identified this limiting
distribution.

By using process-convergence techniques, Grübel and Rösler [10, Theorem 8]
identified, for each $0 \le \alpha < 1$, a nondegenerate random variable K(α) to which
$K_{n, \lfloor \alpha n \rfloor + 1}/n$ converges in distribution; see also the fixed-point equation in their
Theorem 10, and Grübel [9], who used a Markov chain approach and characterized
the limiting distribution in his Theorem 3. Earlier, Devroye [4] had shown that

    $\sup_{n \ge 1} \max_{1 \le m \le n} \mathbf{P}(K_{n,m} \ge t n) \le C \rho^t$

for any $\rho > 3/4$ and some $C \equiv C(\rho)$.

Concerning moments, Grübel and Rösler [10, Theorem 11] showed that
$\mathbf{E}\,K(\alpha) = 2[1 - \alpha \ln \alpha - (1 - \alpha) \ln(1 - \alpha)]$ and Paulsen [17] calculated higher-order
moments of K(α). Grübel [9, end of Section 2] proved convergence of the moments for
finite n to the corresponding moments of the limiting K(α).
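
The mean formula of Grübel and Rösler can be checked against the slopes quoted in the introduction: 2 for QuickMin and QuickMax, and 3 for QuickRand (the average over a uniformly distributed quantile). The Python sketch below is only a numerical sanity check; it assumes the formula reads 2[1 - α ln α - (1 - α) ln(1 - α)] with the convention 0 ln 0 = 0, and the function names and the midpoint-rule integration are choices made here.

```python
import math

def xlogx(x):
    """x * ln(x) with the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)

def mean_limit_key_comparisons(alpha):
    """Mean of the limit K(alpha), in units of n key comparisons."""
    return 2.0 * (1.0 - xlogx(alpha) - xlogx(1.0 - alpha))

def average_over_alpha(steps=100_000):
    """Midpoint-rule average of the mean over alpha uniform on (0, 1)."""
    h = 1.0 / steps
    return h * sum(mean_limit_key_comparisons((i + 0.5) * h) for i in range(steps))
```

Since the integral of -α ln α over (0, 1) equals 1/4, the exact average is 2(1 + 1/4 + 1/4) = 3, in agreement with the QuickRand slope.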

Prior to the present paper, only expectations have been studied for the number of
symbol comparisons for QuickQuant. The current state of knowledge is summarized
by part (i) of Theorem 2 in Vallée et al. [23] (see also their accompanying Figures
1-3); we refer the reader to [23] for the other parts of the theorem, which routinely
specialize part (i) to QuickMin, QuickMax, and QuickRand.

To review their result we need the notation and terminology of Section 2.1 and a
bit more. Using the non-standard abbreviations $y^+ := (1/2) + y$ and $y^- := (1/2) - y$
and the convention $0 \ln 0 := 0$, we define

    $H(y) := -(y^+ \ln y^+ + y^- \ln y^-)$,    if $0 \le y \le 1/2$,
    $H(y) := y^- (\ln y^+ - \ln |y^-|)$,      if $y > 1/2$,

and then set L(y) := 2[1 + H(y)]. According to Theorem 2(i) in [23], for any
Π-tamed source the mean number of symbol comparisons for QuickQuant(n, α) is
asymptotically $\rho n + O(n^{1 - \delta})$ for some δ > 0. Here $\rho \equiv \rho(\alpha)$ and δ both depend on
the probabilistic source, with

(2.4)    $\rho = \sum_{w \in \Sigma^*} p_w \, L\!\left( \frac{|\alpha - \mu_w|}{p_w} \right)$.

They derive (2.4) by first proving the equality

(2.5)    $\rho = 2 \int_T \beta(u, t) \, [(\alpha \vee t) - (\alpha \wedge u)]^{-1} \, du \, dt$

for Π-tamed sources with γ > 1.
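
As a numerical sanity check, the series for the constant can be evaluated for uniform binary keys and QuickMin (target quantile 0) and compared with the coefficient of approximately 5.27938 quoted in the introduction. The sketch below is illustrative and rests on assumptions stated here: that (2.4) is the sum over all prefixes w of p_w times L(|α - μ_w| / p_w); that H takes its first form on [0, 1/2] and its second form above 1/2; and that for uniform bits the jth prefix of length k has probability 2^-k and midpoint (j + 1/2) 2^-k, so the argument of L is j + 1/2. The truncation depth and the name `rho_uniform_binary_min` are choices made for illustration.

```python
import math

def H(y):
    y_plus, y_minus = 0.5 + y, 0.5 - y
    if y <= 0.5:
        ent = lambda x: 0.0 if x == 0 else x * math.log(x)   # convention 0 ln 0 = 0
        return -(ent(y_plus) + ent(y_minus))
    return y_minus * (math.log(y_plus) - math.log(abs(y_minus)))

def L(y):
    return 2.0 * (1.0 + H(y))

def rho_uniform_binary_min(depth=18):
    """Partial sum of the series over prefixes of length at most depth."""
    total, inner, j = 0.0, 0.0, 0
    for k in range(depth + 1):
        while j < 2 ** k:            # extend the inner sum to j = 0, ..., 2^k - 1
            inner += L(j + 0.5)
            j += 1
        total += inner / 2 ** k      # each prefix of length k has probability 2^-k
    return total
```

The empty prefix alone contributes L(1/2) = 2, which is the key-comparison slope for QuickMin; the longer prefixes account for the additional bit comparisons.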

2.3. QuickQuant and QuickVal. Let $S_n^Q \equiv S_n^Q(\alpha)$ denote the total cost required
by QuickQuant(n, α). To prove convergence of $S_n^Q/n$ (in suitable senses to be made
precise later), we exploit an idea introduced by Vallée et al. [23] and begin with
the study of a related algorithm, called QuickVal = QuickVal(n, α), which we now
describe. QuickVal is admittedly somewhat artificial and inefficient; it is important
to keep in mind that we study it mainly as an aid to studying QuickQuant.

Having generated n seeds and then n keys $M_1, \dots, M_n$ (say) using our
probabilistic source, QuickVal is a recursive randomized algorithm to find the rank of
the additional word M(α) in the set $\{M_1, \dots, M_n, M(\alpha)\}$; thus, while QuickQuant
finds the value of the α-quantile in the sample of keys, QuickVal dually finds the
rank of the population α-quantile in the augmented set. First, QuickVal selects a
pivot uniformly at random from the set of keys $\{M_1, \dots, M_n\}$ and finds the rank
of the pivot by (a) comparing the pivot with each of the other keys (we will count
these comparisons) and (b) comparing the pivot with M(α) (we will find it convenient
not to count the cost of this comparison in the total cost). With probability
one, the pivot key will differ from the word M(α). If M(α) is smaller than the
pivot key, then the algorithm operates recursively on the set of keys smaller than
the pivot and determines the rank of the word M(α) in the set $\mathcal{M}_{\mathrm{smaller}} \cup \{M(\alpha)\}$,
where $\mathcal{M}_{\mathrm{smaller}}$ denotes the set of keys smaller than the pivot. Similarly, if M(α)
is greater than the pivot key, then the algorithm operates recursively on the set of
keys larger than the pivot [together with the word M(α)]. Eventually the set of
words on which the algorithm operates reduces to the singleton {M(α)}, and the
algorithm terminates.
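
The description above translates into a short Python sketch. It is an illustration under stated assumptions rather than code from the paper: because the map from seeds to keys is strictly monotone, the sketch works directly on seeds; the seeds are assumed distinct and different from the target quantile; and the function name `quickval`, its parameter names, and the default unit cost (key comparisons) are choices made here.

```python
import random

def quickval(seeds, alpha, beta=lambda u, t: 1.0, rng=random):
    """Return (rank of the target word in the augmented set, total counted cost)."""
    active = list(seeds)
    below = 0                     # keys already known to be smaller than the target
    cost = 0.0
    while active:
        pivot = rng.choice(active)
        others = [u for u in active if u != pivot]
        cost += sum(beta(u, pivot) for u in others)    # (a): counted comparisons
        if alpha < pivot:                               # (b): uncounted comparison
            active = [u for u in others if u < pivot]
        else:
            below += 1 + sum(1 for u in others if u < pivot)
            active = [u for u in others if u > pivot]
    return below + 1, cost
```

The returned rank is deterministic (one plus the number of seeds below the target), while the cost depends on the random pivots.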

Notice that the operation of QuickVal is quite close to that of QuickQuant, for
the same value of α; we expect running costs of the two algorithms to be close,
since when n is large the rank of M(α) in $\{M_1, \dots, M_n, M(\alpha)\}$ should be close
(in relative error terms) to αn. In fact, we will show that if $S_n \equiv S_n(\alpha)$ denotes
the total cost of executing QuickVal(n, α), then $S_n^Q/n$ and $S_n/n$ have the same
limiting distribution, assuming only that the cost function β is ε-tamed for suitably
small ε. In fact, we will show that when all the random variables $S_1^Q, S_2^Q, \dots$ and
$S_1, S_2, \dots$ are strategically defined on a common probability space, then $S_n^Q/n$
and $S_n/n$ both converge in $L^p$ to a common limit for $1 \le p < \infty$.

3. Analysis of QuickVal 

Following some preliminaries in Section 3.1, in Section 3.2 we show that for
$1 \le p < \infty$, a suitably defined $S_n/n$ converges in $L^p$ to a certain random variable S
(defined at the end of Section 3.1) provided only that $\mathbf{E}\,S < \infty$. We also show
that, when the cost function is suitably tamed, $S_n/n$ converges almost surely to
S; see Theorem 3.4 in Section 3.3. We derive an integral expression for $\mathbf{E}\,S$ valid
for a completely general cost function β in Section 3.4 and use it to compute the
expectation when $\beta \equiv 1$. In Section 3.5 we focus on $\mathbf{E}\,S$ with $\beta = \beta_{\mathrm{symb}}$ and derive
a series expression for the expectation. Few comparisons of results obtained here
with the known results reviewed in Section 2.2 are made in the present section;
most such comparisons are deferred to (the first paragraph of) Section 4, where the
previously-studied algorithm of greater interest, QuickQuant, is treated.



3.1. Preliminaries. Our goal is to establish a limit, in various senses, for the ratio
of the total cost required by QuickVal when applied to a file of n keys to n. It
will be both natural and convenient to define all these total costs, one for each
value of n, in terms of a single infinite sequence $(U_i)_{i \ge 1}$ of seeds that are i.i.d.
uniform(0, 1). Indeed, let $L_0 := 0$ and $R_0 := 1$. For $k \ge 1$, inductively define

(3.1)    $\tau_k := \inf\{i : L_{k-1} < U_i < R_{k-1}\}$,
(3.2)    $L_k := \mathbf{1}(U_{\tau_k} < \alpha)\, U_{\tau_k} + \mathbf{1}(U_{\tau_k} > \alpha)\, L_{k-1}$,
(3.3)    $R_k := \mathbf{1}(U_{\tau_k} < \alpha)\, R_{k-1} + \mathbf{1}(U_{\tau_k} > \alpha)\, U_{\tau_k}$,
(3.4)    $S_{n,k} := \sum_{i:\, \tau_k < i \le n} \mathbf{1}(L_{k-1} < U_i < R_{k-1})\, \beta(U_i, U_{\tau_k})$.

(Note that $S_{n,k}$ vanishes if $\tau_k \ge n$.) We then claim that, for each n,

(3.5)    $S_n := \sum_{k \ge 1} S_{n,k}$

has the distribution of the total cost required by QuickVal(n, α).
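
The construction (3.1)-(3.5) can be followed step by step in code. The sketch below is an illustration under assumptions stated here: the seeds are given as a finite Python list holding the first n seeds, indices are 0-based internally, seeds equal to the target quantile are not expected (they occur with probability zero), and the name `quickval_cost_from_seeds` is hypothetical. The search for the next pivot may start just after the previous pivot, because the new interval is contained in the old one and the previous pivot was the first seed in the old interval.

```python
def quickval_cost_from_seeds(seeds, alpha, beta=lambda u, t: 1.0):
    """Total cost as in (3.5), computed from the first n = len(seeds) seeds."""
    n = len(seeds)
    left, right = 0.0, 1.0               # the current interval, initially (0, 1)
    total = 0.0
    start = 0                            # where to resume the search for the next pivot
    while True:
        # (3.1): the next pivot is the first seed falling in the current interval.
        tau = next((i for i in range(start, n) if left < seeds[i] < right), None)
        if tau is None:                  # no pivot among the first n seeds: nothing more to add
            return total
        pivot = seeds[tau]
        # (3.4): later seeds that are still in the interval are compared with the pivot.
        total += sum(beta(seeds[i], pivot)
                     for i in range(tau + 1, n) if left < seeds[i] < right)
        # (3.2)-(3.3): shrink the interval toward the target quantile.
        if pivot < alpha:
            left = pivot
        else:
            right = pivot
        start = tau + 1
```

For example, with seeds 0.5, 0.8, 0.3, 0.6, 0.55 and target 0.58, the successive pivots are 0.5, 0.8, 0.6, 0.55 and the unit-cost total is 4 + 2 + 1 + 0 = 7.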

We offer some explanation here. For each $k \ge 1$, the random interval $(L_{k-1}, R_{k-1})$
(whose length decreases monotonically in k) contains both the target seed α and
the seed $U_{\tau_k}$ corresponding to the kth pivot; the interval contains precisely those
seed values still under consideration after k - 1 pivots have been performed. The
only difference between how we have defined $S_n$ and how it is usually defined is
that we have chosen the initial pivot seed to be the first seed rather than a random
one, and have made this same change recursively. But our change is permissible
because of the following basic probabilistic fact: If $U_1, \dots, U_N, M$ are independent
random variables with $U_1, \dots, U_N$ i.i.d. uniform(0, 1) and M uniformly distributed
on {1, . . . , N}, then $U_M$, like $U_1$, is distributed uniform(0, 1). Thus the conditional
distribution of $U_{\tau_k}$ given $(L_{k-1}, R_{k-1})$ is uniform$(L_{k-1}, R_{k-1})$.

We illustrate our notation for the first two pivots. First, $\tau_1 = 1$; that is, the seed
of the first pivot is the uniform(0, 1) random variable $U_1$. After that, if $\alpha < U_1$
then the seed $U_{\tau_2}$ of the second pivot is chosen as the first seed falling in $(0, U_1)$,
while if $\alpha > U_1$ then $U_{\tau_2}$ is the first seed falling in $(U_1, 1)$. We note that if α = 0
(which means that we are dealing with the total cost required by QuickMin), then
the first of these two cases is always the one that applies and so for every $k \ge 1$ we
have $L_k = 0$ and $R_k = U_{\tau_k}$; we then have that $U_{\tau_k}$ is just the kth record low value
among $U_1, U_2, \dots$.

In order to describe the limit of $S_n/n$, we let

    $I(t, x, y) := \int_x^y \beta(u, t)\, du$,

(3.6)    $I_k := I(U_{\tau_k}, L_{k-1}, R_{k-1})$,

(3.7)    $S := \sum_{k \ge 1} I_k$.

Notice that in the case $\beta \equiv 1$ of key comparisons we have I(t, x, y) = y - x and so
$I_k = R_{k-1} - L_{k-1}$.

In Section 3.2 we show for $1 \le p < \infty$ that $S_n/n$ converges in $L^p$ to S as $n \to \infty$
under proper technical conditions. Under a stronger assumption, we will also prove
almost sure convergence in Section 3.3.
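
For key comparisons the limit S is easy to simulate, because each pivot seed is uniform on the current interval and each term of the series is the current interval length. The Python sketch below is a Monte Carlo illustration under assumptions stated here: the series is truncated once the interval is shorter than a small tolerance, the sample mean is compared with the formula 2[1 - α ln α - (1 - α) ln(1 - α)] recalled in Section 2.2, and the function names, the tolerance, and the sample size are arbitrary choices.

```python
import math
import random

def sample_limit_key_cost(alpha, rng, tol=1e-12):
    """One draw of the sum of successive interval lengths (unit cost per comparison)."""
    left, right, total = 0.0, 1.0, 0.0
    while right - left > tol:              # remaining terms are negligible below tol
        total += right - left
        pivot = rng.uniform(left, right)   # pivot seed is uniform on the current interval
        if pivot < alpha:
            left = pivot
        else:
            right = pivot
    return total

def expected_limit(alpha):
    ent = lambda x: 0.0 if x == 0 else x * math.log(x)
    return 2.0 * (1.0 - ent(alpha) - ent(1.0 - alpha))

rng = random.Random(2024)
draws = [sample_limit_key_cost(0.3, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)      # should land close to expected_limit(0.3)
```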

3.2. Convergence of $S_n/n$ in $L^p$ for $1 \le p < \infty$. Theorem 3.1 is our main
result concerning QuickVal. To state the result, we need the following notation,
extending that of (3.6):

    $I_p(t, x, y) := \int_x^y \beta^p(u, t)\, du$,

(3.8)    $I_{p,k} := I_p(U_{\tau_k}, L_{k-1}, R_{k-1})$.

Theorem 3.1. If $1 \le p < \infty$ and

(3.9)    $\sum_{k \ge 1} (\mathbf{E}\, I_{p,k})^{1/p} < \infty$,

then $S_n/n$ converges in $L^p$ (and therefore also in probability and in distribution)
to S as $n \to \infty$.

Remark 3.2. For p = 1, notice that the assumption of Theorem 3.1 only requires
that $\mathbf{E}\,S < \infty$, which is equivalent to the assertion that $\sum_{k \ge 1} \mathbf{E}\, I_k < \infty$.

Proof. We use $\|\cdot\|_p$ to denote $L^p$-norm. As background, we recall that the $L^p$ law
of large numbers ($L^p$LLN) states that for $1 \le p < \infty$ and i.i.d. random variables
$\xi_1, \xi_2, \dots$ with finite $L^p$-norm, the sample means $\bar{\xi}_n = n^{-1} \sum_{i=1}^n \xi_i$ converge in $L^p$
to the expectation. To prove this, we may assume with no loss of generality that
the expectation is 0, and then the bound

    $\mathbf{P}(|\bar{\xi}_n|^p > c) \le c^{-1}\, \mathbf{E} |\bar{\xi}_n|^p \le c^{-1} \|\xi_1\|_p^p$

following from Markov's inequality and the triangle inequality for $L^p$-norm shows
that the sequence $(|\bar{\xi}_n|^p)$ is uniformly integrable. So the $L^p$LLN follows from the
better-known strong law of large numbers.

Returning to the setting of the theorem, fix k. Conditionally given the quadruple
$C_k = (L_{k-1}, R_{k-1}, \tau_k, U_{\tau_k})$, the random variables $U_i$ with $i > \tau_k$ are i.i.d.
uniform(0, 1). By the $L^p$LLN we have [using the convention 0/0 = 0 for $S_{n,k}/(n - \tau_k)$
when $n = \tau_k$]

(3.10)    $\mathbf{E}\left[ \left| \frac{S_{n,k}}{n - \tau_k} - I_k \right|^p \,\middle|\, C_k \right] \to 0$ as $n \to \infty$

since, with U uniformly distributed and independent of all the $U_i$'s,

(3.11)    $\mathbf{E}[\mathbf{1}(L_{k-1} < U < R_{k-1})\, \beta(U, U_{\tau_k}) \mid C_k] = I_k$.

For our conditional application of the $L^p$LLN in (3.10), it is sufficient to assume
only that the probabilistic source and the cost function $\beta \ge 0$ are such that $I_{p,k}$ is
a.s. finite, and this clearly holds by (3.9).

Our next goal is to show that the left side of (3.10) is dominated by a single
random variable (depending on the fixed value of k) with finite expectation, and
then we will apply the dominated convergence theorem. For every n, using the
convexity of $x^p$ for x > 0 we obtain

    $\mathbf{E}\left[ \left| \frac{S_{n,k}}{n - \tau_k} - I_k \right|^p \,\middle|\, C_k \right] \le 2^{p-1} \left( \mathbf{E}\left[ \left( \frac{S_{n,k}}{n - \tau_k} \right)^p \,\middle|\, C_k \right] + I_k^p \right)$.


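We claim that each of the two terms multiplying $2^{p-1}$ on the right here is bounded
by $I_{p,k}$. First, using the triangle inequality for conditional $L^p$-norm given $C_k$, the
fact that the random variables summed to obtain $S_{n,k}$ are conditionally i.i.d. given
$C_k$, and the definition (3.8) of $I_{p,k}$, we can bound the pth root of the first term by

(3.12)
    $\left\{ \mathbf{E}\left[ \left( \frac{S_{n,k}}{n - \tau_k} \right)^p \,\middle|\, C_k \right] \right\}^{1/p}$
        $\le \frac{1}{n - \tau_k} \sum_{i:\, \tau_k < i \le n} \left\{ \mathbf{E}\left[ \mathbf{1}(L_{k-1} < U_i < R_{k-1})\, \beta^p(U_i, U_{\tau_k}) \mid C_k \right] \right\}^{1/p}$
        $= \left\{ \mathbf{E}\left[ \mathbf{1}(L_{k-1} < U < R_{k-1})\, \beta^p(U, U_{\tau_k}) \mid C_k \right] \right\}^{1/p} = I_{p,k}^{1/p}$
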

with U as at (3.11). For the second term we observe that $[I_k/(R_{k-1} - L_{k-1})]^p$
is the pth power of the absolute value of a uniform average and so is bounded
by the corresponding uniform average of absolute values of pth powers, namely,
$I_{p,k}/(R_{k-1} - L_{k-1})$; thus

(3.13)    $I_k^p \le (R_{k-1} - L_{k-1})^{p-1} I_{p,k} \le I_{p,k}$.

So we conclude that

    $\mathbf{E}\left[ \left| \frac{S_{n,k}}{n - \tau_k} - I_k \right|^p \,\middle|\, C_k \right] \le 2^p I_{p,k}$.


Thus it follows from $\mathbf{E}\, I_{p,k} < \infty$ [which follows from (3.9)] and the dominated
convergence theorem that

(3.14)    $\mathbf{E} \left| \frac{S_{n,k}}{n - \tau_k} - I_k \right|^p \to 0$ as $n \to \infty$.


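Next, we will show from (3.14) that, for each $k$,

\[
\frac{S_{n,k}}{n} \xrightarrow{L^p} I_k \quad \text{as } n \to \infty \tag{3.15}
\]

by proving that

\[
d_{n,k} := \mathbf{E}\left| \frac{S_{n,k}}{n} - \frac{S_{n,k}}{n - \tau_k} \right|^p = \mathbf{E}\left| \frac{\tau_k}{n} \cdot \frac{S_{n,k}}{n - \tau_k} \right|^p
\]

vanishes in the limit as $n \to \infty$. Indeed, the corresponding conditional expectation given $C_k$ is

\[
\mathbf{1}(\tau_k < n) \left( \frac{\tau_k}{n} \right)^p \mathbf{E}\left[ \left| \frac{S_{n,k}}{n - \tau_k} \right|^p \,\Big|\, C_k \right]
\le \mathbf{1}(\tau_k < n) \left( \frac{\tau_k}{n} \right)^p I_{p,k},
\]

recalling the inequality (3.12). The right side is at most $I_{p,k}$ and tends to 0 almost surely as $n \to \infty$. So again using $\mathbf{E} I_{p,k} < \infty$ and applying the dominated convergence theorem we find that $d_{n,k} \to 0$, as desired.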

Finally, we show that $S_n/n$ converges to $S$ in $L^p$. Since we have termwise $L^p$-convergence of $S_{n,k}/n$ to $I_k$ by (3.15), the triangle inequality for $L^p$-norm and the dominated convergence theorem for sums imply that $S_n/n$ converges in $L^p$ to $S$ provided we can find a summable sequence $b_k$ such that

\[
\max\left\{ \sup_{n \ge 1} \left\| \frac{S_{n,k}}{n} \right\|_p,\ \| I_k \|_p \right\} \le b_k.
\]

But, for any $n \ge 1$, we have [by taking $p$th powers in (3.12), then taking expectations, then taking $p$th roots]

\[
\left\| \frac{S_{n,k}}{n} \right\|_p \le \left\| \frac{S_{n,k}}{n - \tau_k} \right\|_p \le (\mathbf{E} I_{p,k})^{1/p}.
\]

Further, $\| I_k \|_p \le (\mathbf{E} I_{p,k})^{1/p}$ follows from (3.13). Finally, $b_k := (\mathbf{E} I_{p,k})^{1/p}$ is assumed to be summable. Thus $S_n/n$ converges to $S$ in $L^p$. □



Remark 3.3. Letting $K_n$ denote the number of key comparisons required by QuickVal$(n, \alpha)$, we find from Theorem 3.1 with $\beta \equiv 1$ that $K_n/n$ converges in $L^p$ ($1 \le p < \infty$) to

\[
K := \sum_{k=0}^{\infty} (R_k - L_k).
\]

(In Section 3.4 we will explicitly show the required condition that $\mathbf{E} K < \infty$; see Remark 3.9.)

Suppose $\alpha = 0$; then the number of key comparisons $K_n$ for QuickVal$(n, \alpha)$ is the same as for QuickMin. In this case Theorem 3.1 gives

\[
\frac{K_n}{n} \xrightarrow{L^p} K = 1 + \sum_{k \ge 1} U_{\tau_k} \tag{3.16}
\]

for $1 \le p < \infty$. The limiting random variable $K$ has mean 2 and the same so-called Dickman distribution as the perpetuity

\[
1 + \sum_{k \ge 1} U_1 \cdots U_k. \tag{3.17}
\]

That (3.16)-(3.17) holds is well known (e.g., Mahmoud et al. [15], Hwang and Tsai [12]).
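The claim that the perpetuity (3.17) has mean 2 can be checked by simulation. The following is a minimal Monte Carlo sketch, not part of the original argument; the function names, the truncation after 60 products, the sample size, and the random seed are all illustrative assumptions.

```python
import random


def sample_perpetuity(rng, terms=60):
    """Draw one (truncated) sample of 1 + sum_{k>=1} U_1 * ... * U_k.

    Truncating after `terms` products is an assumption of this sketch; the
    neglected tail has expectation 2**(-terms), which is negligible here.
    """
    total, product = 1.0, 1.0
    for _ in range(terms):
        product *= rng.random()   # multiply in the next independent uniform
        total += product
    return total


def estimate_mean(samples=20000, seed=12345):
    """Monte Carlo estimate of the mean of the perpetuity (3.17)."""
    rng = random.Random(seed)
    return sum(sample_perpetuity(rng) for _ in range(samples)) / samples


print(round(estimate_mean(), 2))
```

The printed estimate should lie close to 2, up to Monte Carlo error.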

3.3. Almost Sure Convergence of $S_n/n$. Under a tameness assumption, we can also show that $S_n/n$ converges to $S$ almost surely. (Recall the definition of $\epsilon$-tameness.)

Theorem 3.4. Suppose that the cost $\beta$ is $\epsilon$-tamed for some $\epsilon < 1/4$. Then $S_n/n$ defined at (3.5) converges to $S$ almost surely.

Before proving this theorem, we establish three lemmas bounding various quan- 
tities of interest. 



Lemma 3.5. For any $p > 0$ and $k \ge 1$, we have

\[
\mathbf{E}(R_k - L_k)^p \le \left( \frac{2 - 2^{-p}}{p + 1} \right)^k.
\]

Here note that for all $p > 0$ we have

\[
0 < \frac{2 - 2^{-p}}{p + 1} < 1. \tag{3.18}
\]

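Proof. Fix $p > 0$ and $k \ge 1$. Since $R_0 - L_0 = 1$, it is sufficient to prove that

\[
\mathbf{E}\left[ (R_k - L_k)^p \mid L_{k-1}, R_{k-1} \right] \le \frac{2 - 2^{-p}}{p + 1}\, (R_{k-1} - L_{k-1})^p.
\]

Condition on $(L_{k-1}, R_{k-1})$; then with $U$ uniformly distributed over $(L_{k-1}, R_{k-1})$ we have the stochastic inequality

\[
R_k - L_k \le_{\mathrm{st}} \max\{ U - L_{k-1},\ R_{k-1} - U \}.
\]

Thus for $L_{k-1} < R_{k-1}$, with

\[
A_{k-1} := (L_{k-1} + R_{k-1})/2,
\]

we have

\[
\mathbf{E}\left[ (R_k - L_k)^p \mid L_{k-1}, R_{k-1} \right]
\le \mathbf{E}\left[ \left( \max\{ U - L_{k-1},\ R_{k-1} - U \} \right)^p \mid L_{k-1}, R_{k-1} \right]
\]
\[
= \frac{1}{R_{k-1} - L_{k-1}} \left[ \int_{L_{k-1}}^{A_{k-1}} (R_{k-1} - u)^p \, du + \int_{A_{k-1}}^{R_{k-1}} (u - L_{k-1})^p \, du \right]
= \frac{2 - 2^{-p}}{p + 1}\, (R_{k-1} - L_{k-1})^p,
\]

as desired. □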
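The closed form at the end of the preceding proof, and the observation (3.18), can be checked numerically. The sketch below is an illustration only; the function names, the midpoint rule, and the test interval (0.2, 0.9) are assumptions of this example rather than anything taken from the paper.

```python
def contraction_constant(p):
    """The constant (2 - 2^{-p}) / (p + 1) appearing in Lemma 3.5."""
    return (2.0 - 2.0 ** (-p)) / (p + 1.0)


def mean_max_power(lo, hi, p, steps=20000):
    """Midpoint-rule estimate of E[max(U - lo, hi - U)^p], U uniform on (lo, hi).

    `steps` is even, so the kink of the integrand at the midpoint of the
    interval falls on a cell boundary and the rule stays accurate.
    """
    width = (hi - lo) / steps
    total = 0.0
    for j in range(steps):
        u = lo + (j + 0.5) * width
        total += max(u - lo, hi - u) ** p
    return total / steps


for p in (0.5, 1.0, 2.0, 4.0):
    exact = contraction_constant(p) * (0.9 - 0.2) ** p
    print(p, contraction_constant(p), abs(mean_max_power(0.2, 0.9, p) - exact))
```

For each tested value of p the constant lies strictly between 0 and 1, and the numerical and closed-form values agree to within quadrature error.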



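Lemma 3.6. Suppose that the cost $\beta$ is tamed with parameters $\epsilon$ and $c$. Then for any interval $(a, b) \subseteq (0, 1)$, any $t \in (a, b)$, and any $0 \le q < 1/\epsilon$ we have

\[
\int_a^b \beta^q(u, t) \, du \le \frac{2^{q\epsilon} c^q}{1 - q\epsilon}\, (b - a)^{1 - q\epsilon}.
\]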

Proof. Using the tameness assumption, integration immediately gives

\[
\int_a^b \beta^q(u, t) \, du \le \frac{c^q}{1 - q\epsilon} \left[ (t - a)^{1 - q\epsilon} + (b - t)^{1 - q\epsilon} \right].
\]

The lemma now follows from the concavity of $x^{1 - q\epsilon}$ for $x > 0$. □

The next lemma is a simple consequence of the preceding two.



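Lemma 3.7. Suppose that the cost $\beta$ is tamed with parameters $\epsilon < 1$ and $c$. Then for any $k \ge 1$ and any $q > 0$, we have

\[
\mathbf{E} I_k^q \le \left( \frac{2^{\epsilon} c}{1 - \epsilon} \right)^q \left( \frac{2 - 2^{-q(1-\epsilon)}}{q(1 - \epsilon) + 1} \right)^{k-1},
\]

and so the series $\sum_{k \ge 1} \mathbf{E} I_k^q$ converges geometrically quickly.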
Proof. Recalling

\[
I_k = \int_{L_{k-1}}^{R_{k-1}} \beta(u, U_{\tau_k}) \, du,
\]

we find from Lemma 3.6 that

\[
I_k \le \frac{2^{\epsilon} c}{1 - \epsilon}\, (R_{k-1} - L_{k-1})^{1 - \epsilon}.
\]

By application of Lemma 3.5 we thus obtain the desired bound on $\mathbf{E} I_k^q$. The series-convergence assertion follows from the observation (3.18). □
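The bound of Lemma 3.6, which drives the proof just given, can be illustrated on the extremal cost $\beta(u, t) = c\,|u - t|^{-\epsilon}$. The sketch below assumes that tameness with parameters $\epsilon$ and $c$ means $\beta(u, t) \le c\,|u - t|^{-\epsilon}$ (an assumption about a definition stated earlier in the paper and consistent with the integration step in the proof of Lemma 3.6); the function names and the test grid are hypothetical.

```python
def exact_integral(a, b, t, q, eps, c):
    """Integral over (a, b) of (c * |u - t|**(-eps))**q du, valid for q * eps < 1.

    This is the closed form obtained by integrating on (a, t) and (t, b).
    """
    s = 1.0 - q * eps
    return c ** q / s * ((t - a) ** s + (b - t) ** s)


def lemma_bound(a, b, q, eps, c):
    """Right-hand side of Lemma 3.6: 2^{q eps} c^q (b - a)^{1 - q eps} / (1 - q eps)."""
    s = 1.0 - q * eps
    return 2.0 ** (q * eps) * c ** q / s * (b - a) ** s


a, b, q, eps, c = 0.1, 0.8, 2.0, 0.2, 1.5
for t in (0.15, 0.3, 0.45, 0.6, 0.75):
    print(t, exact_integral(a, b, t, q, eps, c) <= lemma_bound(a, b, q, eps, c) + 1e-12)
```

By concavity the bound is attained when t is the midpoint of (a, b), which is why the factor $2^{q\epsilon}$ cannot be improved for this cost.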



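Now we prove Theorem 3.4.

Proof of Theorem 3.4. Clearly it suffices to show that

\[
\frac{S_n}{n} - \frac{\widetilde{S}_n}{n} \xrightarrow{\text{a.s.}} 0 \tag{3.19}
\]

and

\[
\frac{\widetilde{S}_n}{n} \xrightarrow{\text{a.s.}} S, \tag{3.20}
\]

where

\[
\widetilde{S}_n := \sum_{k \ge 1} (n - \tau_k)^+ I_k.
\]

We tackle (3.20) first and then (3.19).

By the monotone convergence theorem, $\widetilde{S}_n/n \uparrow S$ almost surely. But from Lemma 3.7 (using only $\epsilon < 1$) we have $\mathbf{E} S = \sum_{k \ge 1} \mathbf{E} I_k < \infty$, which implies that $S < \infty$ almost surely. Hence (3.20) follows.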

Our proof of (3.19) both is inspired by and follows along the same lines as the "fourth-moment proof" of the strong law of large numbers described in Ross [22, Chapter 8]; as in that proof, we prefer easy calculations involving fourth moments to more difficult ones involving tail probabilities, perhaps at the expense that the value 1/4 in the statement of Theorem 3.4 could be raised by more sophisticated arguments. For (3.19) it suffices to show that, for any $\delta > 0$,

\[
\mathbf{P}\left( \left| \frac{S_n}{n} - \frac{\widetilde{S}_n}{n} \right| > \delta \ \text{i.o.} \right) = 0,
\]

for which it is sufficient by the first Borel-Cantelli lemma and Markov's inequality to show that

\[
\sum_{n \ge 1} \mathbf{E}\left| \frac{S_n}{n} - \frac{\widetilde{S}_n}{n} \right|^4 < \infty. \tag{3.21}
\]

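Here, by the triangle inequality for the $L^4$ norm,

\[
\left\| \frac{S_n}{n} - \frac{\widetilde{S}_n}{n} \right\|_4
= \left\| \sum_{k \ge 1} \frac{(n - \tau_k)^+}{n} \left( \frac{S_{n,k}}{n - \tau_k} - I_k \right) \right\|_4
\le \sum_{k \ge 1} \left\| \frac{(n - \tau_k)^+}{n} \left( \frac{S_{n,k}}{n - \tau_k} - I_k \right) \right\|_4, \tag{3.22}
\]

where we again use the convention $0/0 = 0$ for $S_{n,k}/(n - \tau_k)$ when $n = \tau_k$. As in the proof of Theorem 3.1, we let $C_k$ denote the quadruple $(L_{k-1}, R_{k-1}, \tau_k, U_{\tau_k})$.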
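Also we define

\[
\widehat{I}_k := \mathbf{1}(L_{k-1} < U < R_{k-1})\, \beta(U, U_{\tau_k})
\]

and

\[
M_m(k) := \mathbf{E}\left[ (I_k - \widehat{I}_k)^m \mid C_k \right],
\]

where $U$ is unif(0, 1) and independent of $C_k$. Then routine calculation (see Ross [22, Section 8.4]) shows that

\[
\mathbf{E}\left[ \left( \frac{(n - \tau_k)^+}{n} \right)^4 \left( \frac{S_{n,k}}{n - \tau_k} - I_k \right)^4 \right]
= \mathbf{E}\, \mathbf{E}\left[ \left( \frac{(n - \tau_k)^+}{n} \right)^4 \left( \frac{S_{n,k}}{n - \tau_k} - I_k \right)^4 \,\Big|\, C_k \right]
\]
\[
= \mathbf{E}\left\{ \left( \frac{(n - \tau_k)^+}{n} \right)^4 \frac{(n - \tau_k)^+ M_4(k) + 3 (n - \tau_k)^+ (n - \tau_k - 1)^+ M_2^2(k)}{[(n - \tau_k)^+]^4} \right\}
\]
\[
\le \mathbf{E}\left\{ n^{-4} \left[ n M_4(k) + 3 n (n - 1) M_4(k) \right] \right\} \le 3 n^{-2}\, \mathbf{E} M_4(k), \tag{3.23}
\]

where the first inequality holds because $M_4(k) \ge M_2^2(k)$.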
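The "routine calculation" behind (3.23) rests on the standard identity $\mathbf{E}(X_1 + \cdots + X_m)^4 = m\, \mathbf{E} X^4 + 3m(m-1)(\mathbf{E} X^2)^2$ for i.i.d. mean-zero summands. The sketch below verifies this identity exactly by enumeration for one small discrete distribution; the distribution and the function names are illustrative assumptions, not objects from the paper.

```python
from fractions import Fraction
from itertools import product

# A mean-zero two-point distribution chosen only for illustration:
# X = -1 with probability 2/3 and X = 2 with probability 1/3.
VALUES = [(-1, Fraction(2, 3)), (2, Fraction(1, 3))]


def moment(r):
    """r-th moment of a single summand."""
    return sum(prob * value ** r for value, prob in VALUES)


def fourth_moment_of_sum(m):
    """Exact E[(X_1 + ... + X_m)^4], computed by enumerating all outcomes."""
    total = Fraction(0)
    for outcome in product(VALUES, repeat=m):
        prob = Fraction(1)
        s = 0
        for value, p in outcome:
            prob *= p
            s += value
        total += prob * s ** 4
    return total


def fourth_moment_formula(m):
    """m * E X^4 + 3 m (m - 1) (E X^2)^2, valid for i.i.d. mean-zero summands."""
    return m * moment(4) + 3 * m * (m - 1) * moment(2) ** 2


for m in range(1, 7):
    print(m, fourth_moment_of_sum(m), fourth_moment_formula(m))
```

In (3.23) the summands are the conditionally i.i.d. centered terms making up $S_{n,k} - (n - \tau_k) I_k$, with $m = n - \tau_k$.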

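We will show that $\mathbf{E} M_4(k)$ decays geometrically and then use that fact to prove (3.21). Since $(a - b)^4 \le 8(a^4 + b^4)$ for any real $a$ and $b$, we have

\[
M_4(k) \le 8 \left( \mathbf{E}[\widehat{I}_k^{\,4} \mid C_k] + I_k^4 \right). \tag{3.24}
\]

First, using Lemma 3.7 we find (using only $\epsilon < 1$) that $\mathbf{E} I_k^4 < \infty$ decays geometrically:

\[
\mathbf{E} I_k^4 \le \left( \frac{2^{\epsilon} c}{1 - \epsilon} \right)^4 \left( \frac{2 - 2^{-4(1 - \epsilon)}}{5 - 4\epsilon} \right)^{k-1}. \tag{3.25}
\]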

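Now we analyze, in similar fashion, $\mathbf{E}[\widehat{I}_k^{\,4} \mid C_k]$ in (3.24). Using the assumption $0 \le \epsilon < 1/4$ and Lemma 3.6 we find

\[
\mathbf{E}[\widehat{I}_k^{\,4} \mid C_k] = \int_{L_{k-1}}^{R_{k-1}} \beta^4(u, U_{\tau_k}) \, du \le \frac{2^{4\epsilon} c^4}{1 - 4\epsilon}\, (R_{k-1} - L_{k-1})^{1 - 4\epsilon}.
\]

Applying Lemma 3.5 thus gives the geometric decay

\[
\mathbf{E} \widehat{I}_k^{\,4} \le \frac{2^{4\epsilon} c^4}{1 - 4\epsilon} \left( \frac{2 - 2^{-(1 - 4\epsilon)}}{2 - 4\epsilon} \right)^{k-1}. \tag{3.26}
\]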

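Therefore, it follows from (3.22)-(3.23) together with (3.24)-(3.26) that (3.21) holds:

\[
\sum_{n \ge 1} \mathbf{E}\left| \frac{S_n}{n} - \frac{\widetilde{S}_n}{n} \right|^4
\le \sum_{n \ge 1} \left[ \sum_{k \ge 1} \left( 3 n^{-2}\, \mathbf{E} M_4(k) \right)^{1/4} \right]^4
= \frac{\pi^2}{2} \left[ \sum_{k \ge 1} \left( \mathbf{E} M_4(k) \right)^{1/4} \right]^4 < \infty.
\]

This completes the proof of Theorem 3.4. □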



3.4. Computation of $\mathbf{E} S$: an integral expression. In this section we derive the following simple double-integral expression for $\mathbf{E} S$ in terms of the cost function $\beta$.

Theorem 3.8. For any symmetric cost function $\beta \ge 0$ we have

\[
\mathbf{E} S = 2 \iint_{0 < u < t < 1} \beta(u, t) \left[ (\alpha \vee t) - (\alpha \wedge u) \right]^{-1} du \, dt.
\]

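Proof. Recall that $\mathbf{E} S = \sum_{k \ge 1} \mathbf{E} I_k$, where

\[
I_k = \int_{L_{k-1}}^{R_{k-1}} \beta(u, U_{\tau_k}) \, du.
\]

Recall also that, for each $k$, the conditional distribution of $U_{\tau_k}$ given $L_{k-1}$ and $R_{k-1}$ is uniform$(L_{k-1}, R_{k-1})$. Thus

\[
\mathbf{E} I_k = \mathbf{E} \int_{L_{k-1}}^{R_{k-1}} \int_{L_{k-1}}^{R_{k-1}} (R_{k-1} - L_{k-1})^{-1} \beta(u, w) \, dw \, du
\]
\[
= \iint_{0 < w, u < 1} \beta(w, u)\, \mathbf{E}\left[ (R_{k-1} - L_{k-1})^{-1} \mathbf{1}(L_{k-1} < u, w < R_{k-1}) \right] dw \, du
\]
\[
= 2 \iint_{0 < w < u < 1} \beta(w, u) \int_{0 \le x \le \alpha \le y \le 1} (y - x)^{-1} \mathbf{1}(x < w < u < y)\, \mathbf{P}(L_{k-1} \in dx,\ R_{k-1} \in dy) \, dw \, du,
\]

where the last step uses the symmetry of $\beta$. Hence

\[
\mathbf{E} S = 2 \iint_{0 < w < u < 1} \beta(w, u) \int_{0 \le x \le \alpha \le y \le 1} (y - x)^{-1} \mathbf{1}(x < w < u < y)\, \nu(dx, dy) \, dw \, du, \tag{3.27}
\]

where $\nu$ is the measure

\[
\nu(dx, dy) := \sum_{k \ge 0} \mathbf{P}(L_k \in dx,\ R_k \in dy). \tag{3.28}
\]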

k>0 

As established in the Appendix in Proposition A.1, one has the tractable expression

\[
\nu(dx, dy) = \delta_0(dx)\, \delta_1(dy) + (1 - x)^{-1} dx\, \delta_1(dy) + \delta_0(dx)\, y^{-1} dy + 2 (y - x)^{-2} dx \, dy.
\]

Using this last expression, routine calculation shows that, for $0 < w < u < 1$,

\[
\int_{0 \le x \le \alpha \le y \le 1} (y - x)^{-1} \mathbf{1}(x < w < u < y)\, \nu(dx, dy) = \left[ (\alpha \vee u) - (\alpha \wedge w) \right]^{-1}. \tag{3.29}
\]

Substitute (3.29) into (3.27) to complete the proof of the theorem. □

Remark 3.9. We now let $\beta \equiv 1$ and use Theorem 3.8 to analyze the expectation of the number $K_n$ of key comparisons required by QuickVal$(n, \alpha)$. Then the expected value in Theorem 3.8 is

\[
2 \iint_{0 < u < t < 1} \left[ (\alpha \vee t) - (\alpha \wedge u) \right]^{-1} du \, dt = 2 \left[ 1 - \alpha \ln \alpha - (1 - \alpha) \ln (1 - \alpha) \right] < \infty. \tag{3.30}
\]

It follows by (3.30) that for $\alpha = 0$ we have

\[
\lim_{n \to \infty} \mathbf{E} K_n / n = 2,
\]

which is well known since $K_n$ in this case represents the number of key comparisons required by QuickMin applied to a file of $n$ keys (e.g., Mahmoud et al. [15]). Thus we are now able to conclude that for any $\alpha$ ($0 \le \alpha \le 1$), $\mathbf{E} K_n / n$ converges to the simple constant in (3.30). Also notice that we have verified the hypothesis of Theorem 3.1 for $p = 1$ (see also Remark 3.2) by (3.30), as we promised in Remark 3.3 that we would.
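3.5. Computation of $\mathbf{E} S$: a series expression. We now restrict to the cost function $\beta_{\mathrm{symb}}$ and use Theorem 3.8 to derive a series expression for $\mathbf{E} S$. In the notation of Section 2.1, we have

\[
\tfrac{1}{2}\, \mathbf{E} S = \sum_{w \in \Sigma^*} \iint_{a_w < u < t < b_w} \left[ (\alpha \vee t) - (\alpha \wedge u) \right]^{-1} du \, dt,
\]

which is easily obtained by noting that for $u < t$ we have

\[
\beta_{\mathrm{symb}}(u, t) = \sum_{w \in \Sigma^*} \mathbf{1}(a_w < u < t < b_w). \tag{3.31}
\]

Define

\[
J(w) := \iint_{a_w < u < t < b_w} \left[ (\alpha \vee t) - (\alpha \wedge u) \right]^{-1} du \, dt.
\]

Then routine calculation shows that

\[
J(w) = \tfrac{1}{2}\, p_w\, L\!\left( \frac{|\alpha - \mu_w|}{p_w} \right).
\]

Thus

\[
\mathbf{E} S = \sum_{w \in \Sigma^*} p_w\, L\!\left( \frac{|\alpha - \mu_w|}{p_w} \right). \tag{3.32}
\]

This last equation is in agreement with Theorem 2(i) of Vallée et al. [23] (see also their Figure 1). But, unlike in [23], our calculation requires no assumptions of tameness, nor even that $\mathbf{E} S$ is finite.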



4. Analysis of QuickQuant 

Following some preliminaries in Section 4.1, in Section 5 we show that a suitably defined $S_n^Q/n$ converges in $L^p$ to $S$ for $1 \le p < \infty$ provided that the cost function $\beta$ is $\epsilon$-tamed with $\epsilon < 1/p$; hence $S_n^Q/n$ and $S_n/n$ have the same limiting distribution provided only that the cost function $\beta$ is $\epsilon$-tamed for suitably small $\epsilon$. Granting that result for a moment, we can now relate three of the results obtained in Section 3 to previously known results reviewed in Section 2.2. From Remark 3.3 we recover the result of [10, Theorem 8] (in a cosmetically different, but equivalent, form; compare
O Theorem 3]) for the limiting distribution of the number of key comparisons, 
and from Remark 3.9 we recover first-moment information for the same. Finally, recalling that $L^p$-convergence implies convergence of means, from (3.32) we recover at least the lead-order term in the asymptotics of [23] discussed at (2.4).

4.1. Preliminaries. We will closely follow the framework described in Section 3 for the analysis of QuickVal and construct a random variable, call it $S_n^Q$, that has the distribution of the total cost required by QuickQuant when applied to a file of $n$ keys. Our goal is to show that, under suitable technical conditions, $S_n^Q/n$ converges in $L^p$ to $S$ defined at (3.7).

Again, we define $S_n^Q$ in terms of an infinite sequence $(U_i)_{i \ge 1}$ of seeds that are i.i.d. uniform(0, 1). Let $m_n$ (with $m_n/n \to \alpha$) denote our target rank for QuickQuant. Let $\tau_k(n)$ denote the index of the seed that corresponds to the $k$th pivot. As in Section 3.1, we will set the first pivot index $\tau_1(n)$ to 1 rather than to a randomly chosen integer from $\{1, \ldots, n\}$. For $k \ge 1$, we will use $L_{k-1}(n)$ and $R_{k-1}(n)$, as defined below, to denote the lower and upper bounds, respectively, of seeds of words that are eligible to be compared with the $k$th pivot. [Notice that $\tau_k(n)$, $L_k(n)$, and $R_k(n)$ are analogous to $\tau_k$, $L_k$, and $R_k$ defined in Section 3.1; see (3.1)-(3.3).] Hence we let $L_0(n) := 0$ and $R_0(n) := 1$, and for $k \ge 1$ we inductively define

\[
\tau_k(n) := \inf\{ i \le n : L_{k-1}(n) < U_i < R_{k-1}(n) \},
\]

and

\[
L_k(n) := \mathbf{1}(\mathrm{pivrank}_k(n) \le m_n)\, U_{\tau_k(n)} + \mathbf{1}(\mathrm{pivrank}_k(n) > m_n)\, L_{k-1}(n),
\]
\[
R_k(n) := \mathbf{1}(\mathrm{pivrank}_k(n) \ge m_n)\, U_{\tau_k(n)} + \mathbf{1}(\mathrm{pivrank}_k(n) < m_n)\, R_{k-1}(n)
\]

if $\tau_k(n) < \infty$ but

\[
(L_k(n), R_k(n)) := (L_{k-1}(n), R_{k-1}(n))
\]

if $\tau_k(n) = \infty$. Here $\mathrm{pivrank}_k(n)$ denotes the rank of the $k$th pivot seed $U_{\tau_k(n)}$ if $\tau_k(n) < \infty$ and $m_n$ otherwise. Recall that the infimum of the empty set is $\infty$; hence $\tau_k(n) = \infty$ if and only if $L_{k-1}(n) = R_{k-1}(n)$.
Using this notation, let

\[
S_{n,k}^Q := \sum_{i:\, \tau_k(n) < i \le n} \mathbf{1}(L_{k-1}(n) < U_i < R_{k-1}(n))\, \beta(U_i, U_{\tau_k(n)})
\]

be the total cost of all comparisons (for the first $n$ keys) with the $k$th pivot key. Then

\[
S_n^Q := \sum_{k \ge 1} S_{n,k}^Q \tag{4.1}
\]

has the distribution of the total cost required by QuickQuant.

Notice that the expression (4.1) is analogous to (3.5). In fact, we will prove the $L^p$-convergence of $S_n^Q/n$ to $S$ by comparing the corresponding expressions for QuickVal and QuickQuant.
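To make the seed-based construction concrete, the following sketch evaluates $S_n^Q$ with $\beta \equiv 1$ from the recursion for $\tau_k(n)$, $L_k(n)$, $R_k(n)$ and compares it with the number of key comparisons made by a plain QuickSelect that always uses the first key of the current subfile as pivot. It is an illustration under stated assumptions (distinct seeds in (0, 1), stable partitioning, hypothetical function names), not code from the paper.

```python
import random


def quickquant_cost_via_seeds(seeds, m):
    """S_n^Q with beta = 1, computed from the recursion of Section 4.1.

    Assumes the seeds are distinct numbers in (0, 1) and 1 <= m <= len(seeds).
    """
    rank = {u: r for r, u in enumerate(sorted(seeds), start=1)}
    lo, hi = 0.0, 1.0                     # (L_{k-1}(n), R_{k-1}(n))
    total = 0
    while True:
        eligible = [i for i, u in enumerate(seeds) if lo < u < hi]
        if not eligible:                  # tau_k(n) is infinite: nothing left
            return total
        tau = eligible[0]                 # index of the k-th pivot seed
        total += len(eligible) - 1        # every other eligible seed has index > tau
        pivot = seeds[tau]
        if rank[pivot] <= m:
            lo = pivot
        if rank[pivot] >= m:
            hi = pivot                    # if rank == m, the interval becomes empty


def quickselect_comparisons(keys, m):
    """Key comparisons of QuickSelect using the first key as pivot each time."""
    count = 0
    while keys:
        pivot, rest = keys[0], keys[1:]
        count += len(rest)                # compare every other key with the pivot
        smaller = [x for x in rest if x < pivot]
        larger = [x for x in rest if x > pivot]
        j = len(smaller) + 1              # rank of the pivot in the subfile
        if j == m:
            return count
        if j > m:
            keys = smaller
        else:
            keys, m = larger, m - j
    return count


rng = random.Random(7)
seeds = [rng.random() for _ in range(200)]
print([quickquant_cost_via_seeds(seeds, m) for m in (1, 100, 200)])
```

The two counts agree for every target rank because the subfile handled at step $k$ consists exactly of the seeds in $(L_{k-1}(n), R_{k-1}(n))$, listed in index order.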

5. Convergence of $S_n^Q/n$ in $L^p$ for $1 \le p < \infty$

The following is our main theorem regarding QuickQuant.

Theorem 5.1. Let $1 \le p < \infty$. Suppose that the cost function $\beta$ is $\epsilon$-tamed with $\epsilon < 1/p$. Then $S_n^Q/n$ converges in $L^p$ to $S$.

Remark 5.2. Note that as $p$ increases, getting $L^p$-convergence requires the increasingly stronger condition $\epsilon < 1/p$. Thus we have convergence of moments of all orders provided the source is $\gamma$-tamed for every $\gamma > 0$ (for example, if it is g-tamed as in Remark 2.5, as is true for memoryless and most Markov sources).

The proof of Theorem 5.1 will make use of the following analogue of Lemma 3.5, whose proof is essentially the same and therefore omitted.

Lemma 5.3. For any $p > 0$ and $k \ge 1$ and $n \ge 1$, we have

\[
\mathbf{E}(R_k(n) - L_k(n))^p \le \left( \frac{2 - 2^{-p}}{p + 1} \right)^k.
\]

Proof of Theorem 5.1. Part of our strategy in proving this theorem is to compare QuickQuant with QuickVal. Hence we will frequently refer to the notation established in Section 3.1 for the analysis of QuickVal. For each $k$, observe that as $n \to \infty$ we have

\[
\tau_k(n) \xrightarrow{\text{a.s.}} \tau_k, \quad U_{\tau_k(n)} \xrightarrow{\text{a.s.}} U_{\tau_k}, \quad L_k(n) \xrightarrow{\text{a.s.}} L_k, \quad R_k(n) \xrightarrow{\text{a.s.}} R_k,
\]

where $\tau_k$, $L_k$, and $R_k$ are defined in Section 3.1 [see (3.1)-(3.3)]. (In fact, in each of these four cases of convergence, the left-hand side almost surely becomes equal to its limit for all sufficiently large $n$.) Thus for each $k \ge 1$ we have

\[
S_{n,k}^Q - S_{n,k} \xrightarrow{\text{a.s.}} 0, \tag{5.1}
\]

where $S_{n,k}$ is defined at (3.4): indeed, again the difference almost surely vanishes for all sufficiently large $n$. In proving Theorem 3.1 we showed [at (3.15)] that

\[
\frac{S_{n,k}}{n} \xrightarrow{L^p} I_k,
\]

where $I_k$ is defined at (3.6), and it is somewhat easier (by means of conditional application of the strong law of large numbers, rather than the $L^p$ law of large numbers, together with Fubini's theorem) to show that

\[
\frac{S_{n,k}}{n} \xrightarrow{\text{a.s.}} I_k. \tag{5.2}
\]

n 

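Combining (5.1) and (5.2), for each $k \ge 1$ we have

\[
\frac{S_{n,k}^Q}{n} \xrightarrow{\text{a.s.}} I_k. \tag{5.3}
\]

What we want to show is that

\[
\frac{S_n^Q}{n} = \sum_{k \ge 1} \frac{S_{n,k}^Q}{n} \xrightarrow{L^p} \sum_{k \ge 1} I_k = S. \tag{5.4}
\]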



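Choose any sequence $(a_k)_{k \ge 1}$ of positive numbers summing to 1, and let $A$ be the probability measure on the positive integers with this probability mass function. Then, once again using the fact that the $p$th power of the absolute value of an average is bounded by the average of $p$th powers of absolute values,

\[
\mathbf{E}\left| \frac{S_n^Q}{n} - S \right|^p
= \mathbf{E}\left| \sum_{k \ge 1} a_k \cdot a_k^{-1} \left( \frac{S_{n,k}^Q}{n} - I_k \right) \right|^p
\le \mathbf{E} \sum_{k \ge 1} a_k \left| a_k^{-1} \left( \frac{S_{n,k}^Q}{n} - I_k \right) \right|^p
= \sum_{k \ge 1} a_k\, \mathbf{E}\left| a_k^{-1} \left( \frac{S_{n,k}^Q}{n} - I_k \right) \right|^p.
\]

So for (5.4) it suffices to prove that, with respect to the product probability $\mathbf{P} \times A$, as $n \to \infty$ the sequence

\[
a_k^{-1} \left( \frac{S_{n,k}^Q}{n} - I_k \right)
\]

converges in $L^p$ to 0. What we know from (5.3) is that the sequence converges almost surely with respect to $\mathbf{P} \times A$.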

Now almost sure convergence together with boundedness in $L^{p + \delta}$ are, for any $\delta > 0$, sufficient for convergence in $L^p$ because the boundedness condition implies uniform integrability (e.g., Chung [2, Exercise 4.5.8]). Thus our proof is reduced to showing that, for some $q > p$, the sequence

\[
\sum_{k \ge 1} a_k\, \mathbf{E}\left| a_k^{-1} \left( \frac{S_{n,k}^Q}{n} - I_k \right) \right|^q
\]

is bounded in $n$, for a suitably chosen probability mass function $(a_k)$. Indeed, by convexity of $q$th power,

\[
\sum_{k \ge 1} a_k\, \mathbf{E}\left| a_k^{-1} \left( \frac{S_{n,k}^Q}{n} - I_k \right) \right|^q
\le 2^{q-1} \left[ \sum_{k \ge 1} a_k^{-(q-1)}\, \mathbf{E}\left| \frac{S_{n,k}^Q}{n} \right|^q + \sum_{k \ge 1} a_k^{-(q-1)}\, \mathbf{E} I_k^q \right], \tag{5.5}
\]

and we will show that each sum on the right-hand side of (5.5) is bounded in order to prove the theorem. The value of $q$ that we use can be any satisfying $\epsilon < 1/q < 1/p$.
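First we recall from Lemma 3.7 that

\[
\mathbf{E} I_k^q \le \left( \frac{2^{\epsilon} c}{1 - \epsilon} \right)^q \left( \frac{2 - 2^{-q(1 - \epsilon)}}{q(1 - \epsilon) + 1} \right)^{k-1}, \quad k \ge 1, \tag{5.6}
\]

with geometric decay. Thus the second sum on the right in (5.5) is finite if the cost is $\epsilon$-tamed with $\epsilon < 1$ and the sequence $(a_k)$ is suitably chosen not to decay too quickly.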

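Next we analyze $\mathbf{E}|S_{n,k}^Q/n|^q$ for the first sum on the right in (5.5). Let

\[
\nu_{k-1}(n) := \left| \{ i : L_{k-1}(n) < U_i < R_{k-1}(n),\ \tau_k(n) < i \le n \} \right|.
\]

Until further notice our calculations are done only over the event $\{ \nu_{k-1}(n) > 0 \}$. Then, bounding the $q$th power of the absolute value of an average by the average of $q$th powers of absolute values,

\[
\left| \frac{S_{n,k}^Q}{n} \right|^q
= \left( \frac{\nu_{k-1}(n)}{n} \right)^q \left| \frac{1}{\nu_{k-1}(n)} \sum_{i:\, L_{k-1}(n) < U_i < R_{k-1}(n)} \mathbf{1}(\tau_k(n) < i \le n)\, \beta(U_i, U_{\tau_k(n)}) \right|^q
\]
\[
\le \left( \frac{\nu_{k-1}(n)}{n} \right)^q \frac{1}{\nu_{k-1}(n)} \sum_{i:\, L_{k-1}(n) < U_i < R_{k-1}(n)} \mathbf{1}(\tau_k(n) < i \le n)\, \beta^q(U_i, U_{\tau_k(n)}). \tag{5.7}
\]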

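Let $D_k(n)$ denote the quintuple $(L_{k-1}(n), R_{k-1}(n), \tau_k(n), U_{\tau_k(n)}, \nu_{k-1}(n))$, and notice that, conditionally given $D_k(n)$, the $\nu_{k-1}(n)$ values $U_i$ appearing in (5.7) are i.i.d. unif$(L_{k-1}(n), R_{k-1}(n))$. Using (5.7), we bound the conditional expectation of $|S_{n,k}^Q/n|^q$ given $D_k(n)$. We have

\[
\mathbf{E}\left[ \left| \frac{S_{n,k}^Q}{n} \right|^q \,\Big|\, D_k(n) \right]
\le [R_{k-1}(n) - L_{k-1}(n)]^{-1} \left( \frac{\nu_{k-1}(n)}{n} \right)^q \int_{L_{k-1}(n)}^{R_{k-1}(n)} \beta^q(u, U_{\tau_k(n)}) \, du. \tag{5.8}
\]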



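Under $\epsilon$-tameness of $\beta$ with $\epsilon < 1/q$, we find from Lemma 3.6 that

\[
\int_{L_{k-1}(n)}^{R_{k-1}(n)} \beta^q(u, U_{\tau_k(n)}) \, du \le \frac{2^{q\epsilon} c^q}{1 - q\epsilon}\, [R_{k-1}(n) - L_{k-1}(n)]^{1 - q\epsilon}. \tag{5.9}
\]

From (5.8)-(5.9), it follows that if $\epsilon < 1/q$, then

\[
\mathbf{E}\left[ \left| \frac{S_{n,k}^Q}{n} \right|^q \,\Big|\, D_k(n) \right]
\le \frac{2^{q\epsilon} c^q}{1 - q\epsilon}\, [R_{k-1}(n) - L_{k-1}(n)]^{q - q\epsilon} \left[ \frac{\nu_{k-1}(n)}{n (R_{k-1}(n) - L_{k-1}(n))} \right]^q.
\]

Until this point we have worked only over the event $\{ \nu_{k-1}(n) > 0 \}$, but now we enlarge our scope to the event $\{ L_{k-1}(n) < R_{k-1}(n) \}$ and note that the preceding inequality holds there, as well.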

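Next notice that, conditionally given the triple

\[
\widetilde{D}_k(n) := (L_{k-1}(n), R_{k-1}(n), \tau_k(n)),
\]

the values $U_i$ with $\tau_k(n) < i \le n$ are i.i.d. unif(0, 1), and so the number $\nu_{k-1}(n)$ of them falling in the interval $(L_{k-1}(n), R_{k-1}(n))$ is distributed binomial$(m, t)$ with $m = n - \tau_k(n)$ and $t = R_{k-1}(n) - L_{k-1}(n)$, and (representing a binomial as a sum of independent Bernoulli random variables and applying the triangle inequality for $L^q$) has moment of order $q$ bounded by $m^q t$. Thus

\[
\mathbf{E}\left[ \left( \frac{\nu_{k-1}(n)}{n (R_{k-1}(n) - L_{k-1}(n))} \right)^q \,\Big|\, \widetilde{D}_k(n) \right] \le [R_{k-1}(n) - L_{k-1}(n)]^{1 - q},
\]

so that

\[
\mathbf{E}\left[ \left| \frac{S_{n,k}^Q}{n} \right|^q \,\Big|\, \widetilde{D}_k(n) \right] \le \frac{2^{q\epsilon} c^q}{1 - q\epsilon}\, [R_{k-1}(n) - L_{k-1}(n)]^{1 - q\epsilon}.
\]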



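Since this inequality holds even when $L_{k-1}(n) = R_{k-1}(n)$, we can take expectations to conclude

\[
\mathbf{E}\left| \frac{S_{n,k}^Q}{n} \right|^q
\le \frac{2^{q\epsilon} c^q}{1 - q\epsilon}\, \mathbf{E}[R_{k-1}(n) - L_{k-1}(n)]^{1 - q\epsilon}
\le \frac{2^{q\epsilon} c^q}{1 - q\epsilon} \left( \frac{2 - 2^{-(1 - q\epsilon)}}{2 - q\epsilon} \right)^{k-1}, \tag{5.10}
\]

where at the second inequality we have employed Lemma 5.3.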

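From (5.6) and (5.10) we see that we can choose $(a_k)$ to be the geometric distribution $a_k = (1 - \theta) \theta^{k-1}$, $k \ge 1$, with

\[
\left( \max\left\{ \frac{2 - 2^{-q(1 - \epsilon)}}{q(1 - \epsilon) + 1},\ \frac{2 - 2^{-(1 - q\epsilon)}}{2 - q\epsilon} \right\} \right)^{1/(q-1)} < \theta < 1.
\]

With this choice both sums on the right in (5.5) are dominated by convergent geometric series, uniformly in $n$. We then conclude that

\[
\sum_{k \ge 1} a_k\, \mathbf{E}\left| a_k^{-1} \left( \frac{S_{n,k}^Q}{n} - I_k \right) \right|^q
\]

is bounded in $n$, and therefore that $S_n^Q/n$ converges to $S$ in $L^p$, if the cost function is $\epsilon$-tamed with $\epsilon < 1/p$. □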
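The binomial moment bound used in the preceding proof, namely that a binomial$(m, t)$ random variable has moment of order $q \ge 1$ at most $m^q t$, can be checked exactly for small parameters. The sketch below is an illustration with hypothetical function names; it evaluates the moment directly from the probability mass function.

```python
import math


def binomial_moment(m, t, q):
    """E[N^q] for N ~ binomial(m, t), summed directly over the mass function."""
    return sum(
        math.comb(m, j) * t ** j * (1.0 - t) ** (m - j) * j ** q
        for j in range(m + 1)
    )


def triangle_bound(m, t, q):
    """The bound m^q * t obtained from the triangle inequality in L^q."""
    return m ** q * t


for m in (1, 5, 40):
    for t in (0.01, 0.3, 0.9):
        for q in (1.0, 1.5, 3.0):
            ok = binomial_moment(m, t, q) <= triangle_bound(m, t, q) * (1.0 + 1e-9)
            print(m, t, q, ok)
```

For $q = 1$ the bound holds with equality, since the mean of the binomial is $mt$.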
A. Appendix: A tractable expression for the measure $\nu$

The purpose of this appendix is to prove the following proposition used in the computation of $\mathbf{E} S$ in Section 3.4.

Proposition A.1. With $(L_k, R_k)$ defined at (3.2)-(3.3) as the interval of values eligible to be compared with the $k$th pivot chosen by QuickVal, and with

\[
\nu(dx, dy) := \sum_{k \ge 0} \mathbf{P}(L_k \in dx,\ R_k \in dy)
\]

as defined at (3.28), we have

\[
\nu(dx, dy) = \delta_0(dx)\, \delta_1(dy) + (1 - x)^{-1} dx\, \delta_1(dy) + \delta_0(dx)\, y^{-1} dy + 2 (y - x)^{-2} dx \, dy.
\]

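Proof. To begin, since $L_0 := 0$ and $R_0 := 1$ we have

\[
\mathbf{P}(L_0 \in dx,\ R_0 \in dy) = \delta_0(dx)\, \delta_1(dy), \tag{A.1}
\]

where $\delta_z$ denotes the probability measure concentrated at $z$. Now assume $k \ge 1$. If $0 \le \lambda \le \alpha \le \rho \le 1$, then

\[
\mathbf{P}(L_k \in dx,\ R_k \in dy \mid L_{k-1} = \lambda,\ R_{k-1} = \rho)
= \delta_\rho(dy)\, \mathbf{1}(\lambda < x < \alpha) (\rho - \lambda)^{-1} dx + \delta_\lambda(dx)\, \mathbf{1}(\alpha < y < \rho) (\rho - \lambda)^{-1} dy.
\]

Hence

\[
\mathbf{P}(L_k \in dx,\ R_k \in dy) = \int \left[ \delta_\rho(dy)\, \mathbf{1}(\lambda < x < \alpha) (\rho - \lambda)^{-1} dx + \delta_\lambda(dx)\, \mathbf{1}(\alpha < y < \rho) (\rho - \lambda)^{-1} dy \right] \mathbf{P}(L_{k-1} \in d\lambda,\ R_{k-1} \in d\rho). \tag{A.2}
\]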
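We can infer [and inductively prove using (A.2)] that, for $k \ge 1$,

\[
\mathbf{P}(L_k \in dx,\ R_k \in dy) = \delta_1(dy)\, f_k(x) \, dx + \delta_0(dx)\, g_k(y) \, dy + h_k(x, y) \, dx \, dy, \tag{A.3}
\]

where

\[
f_1(x) = \mathbf{1}(0 < x < \alpha), \quad g_1(y) = \mathbf{1}(\alpha < y < 1), \quad h_1(x, y) = 0,
\]

and, for $k \ge 2$,

\[
f_k(x) = \mathbf{1}(0 < x < \alpha) \int \mathbf{1}(0 < \lambda < x) (1 - \lambda)^{-1} f_{k-1}(\lambda) \, d\lambda, \tag{A.4}
\]
\[
g_k(y) = \mathbf{1}(\alpha < y < 1) \int \mathbf{1}(y < \rho < 1)\, \rho^{-1} g_{k-1}(\rho) \, d\rho, \tag{A.5}
\]
\[
h_k(x, y) = \mathbf{1}(0 < x < \alpha < y < 1) \Big[ (1 - x)^{-1} f_{k-1}(x) + y^{-1} g_{k-1}(y)
+ \int \mathbf{1}(0 < \lambda < x) (y - \lambda)^{-1} h_{k-1}(\lambda, y) \, d\lambda
+ \int \mathbf{1}(y < \rho < 1) (\rho - x)^{-1} h_{k-1}(x, \rho) \, d\rho \Big]. \tag{A.6}
\]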


Henceforth suppose $0 < x < \alpha < y < 1$. From (A.5) we obtain

\[
g_k(y) = \frac{(-\ln y)^{k-1}}{(k-1)!}, \quad k \ge 1, \tag{A.7}
\]

whence

\[
\sum_{k \ge 1} g_k(y) = y^{-1}. \tag{A.8}
\]

By recognizing symmetry between (A.4) and (A.5), we also find

\[
f_k(x) = \frac{[-\ln(1 - x)]^{k-1}}{(k-1)!}, \quad k \ge 1, \tag{A.9}
\]

and so

\[
\sum_{k \ge 1} f_k(x) = (1 - x)^{-1}. \tag{A.10}
\]



In order to compute $\sum_{k \ge 1} h_k(x, y)$, we consider the generating function

\[
H(x, y, z) := \sum_{k \ge 1} h_k(x, y)\, z^k. \tag{A.11}
\]

From (A.6),

\[
H(x, y, z) = z \Big[ (1 - x)^{-1} \sum_{k \ge 1} f_k(x)\, z^k + y^{-1} \sum_{k \ge 1} g_k(y)\, z^k
+ \int_0^x (y - \lambda)^{-1} H(\lambda, y, z) \, d\lambda + \int_y^1 (\rho - x)^{-1} H(x, \rho, z) \, d\rho \Big]. \tag{A.12}
\]

Using this integral equation, we will show via a series of lemmas culminating in Lemma A.10 that

\[
H(x, y) := H(x, y, 1) = \sum_{k \ge 1} h_k(x, y) \ \text{ equals } \ 2 (y - x)^{-2}. \tag{A.13}
\]

Combining equations (A.3), (A.8), (A.10), and (A.13), we obtain the desired expression for $\nu$. □
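As a sanity check on (A.13), one can confirm numerically that the candidate $2(y - x)^{-2}$ satisfies the integral equation (A.12) at $z = 1$, where by (A.8) and (A.10) the two series contribute $(1 - x)^{-2}$ and $y^{-2}$. The sketch below is only an illustration of consistency, not a proof of uniqueness; the function names, the midpoint rule, and the test points are assumptions of this example.

```python
def midpoint_integral(f, a, b, steps=2000):
    """Composite midpoint rule for the integral of f over (a, b)."""
    h = (b - a) / steps
    return h * sum(f(a + (j + 0.5) * h) for j in range(steps))


def candidate(x, y):
    """The claimed value of H(x, y) in (A.13)."""
    return 2.0 * (y - x) ** -2


def right_side(x, y):
    """Right-hand side of (A.12) at z = 1 with H replaced by the candidate."""
    forcing = (1.0 - x) ** -2 + y ** -2   # series terms, summed via (A.10), (A.8)
    left_part = midpoint_integral(lambda lam: candidate(lam, y) / (y - lam), 0.0, x)
    right_part = midpoint_integral(lambda rho: candidate(x, rho) / (rho - x), y, 1.0)
    return forcing + left_part + right_part


for x, y in ((0.1, 0.6), (0.3, 0.7), (0.25, 0.9)):
    print(x, y, candidate(x, y), right_side(x, y))
```

The two printed values agree up to quadrature error at each test point, reflecting the exact evaluation $\int_0^x 2(y - \lambda)^{-3} d\lambda + \int_y^1 2(\rho - x)^{-3} d\rho = 2(y - x)^{-2} - y^{-2} - (1 - x)^{-2}$.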

Throughout the remainder of this appendix, whenever we refer to $H(x, y)$ we tacitly suppose that $0 < x < \alpha < y < 1$.

Lemma A.2. $H(x, y) < \infty$ almost everywhere.

Proof. We revisit Remarks 3.3 and 3.9 and consider the number of key comparisons required by QuickVal$(n, \alpha)$. As shown at (3.30), we have $\mathbf{E} S < \infty$ in this case. On the other hand, with $\beta \equiv 1$, from (3.27)-(3.28), (A.1), (A.3), and (A.8)-(A.10), we have

\[
\mathbf{E} S = 2 \iint_{0 < w < u < 1} \Big[ 1 + \int_{0 < x < \alpha} (1 - x)^{-2} \mathbf{1}(x < w) \, dx + \int_{\alpha < y < 1} y^{-2} \mathbf{1}(y > u) \, dy
+ \iint_{0 < x < \alpha < y < 1} (y - x)^{-1} \mathbf{1}(x < w < u < y)\, H(x, y) \, dx \, dy \Big] dw \, du.
\]

Thus $H(x, y) < \infty$ almost everywhere. □

The next lemma establishes monotonicity properties of $H(x, y)$.

Lemma A.3. $H(x, y)$ is increasing in $x$ and decreasing in $y$.

Proof. For each $k \ge 1$, we see from (A.9) that $f_k(x)$ is increasing in $x$ and from (A.7) that $g_k(y)$ is decreasing in $y$. Since $h_1 \equiv 0$, it follows by induction on $k$ from (A.6) that $h_k(x, y)$ is increasing in $x$ and decreasing in $y$ for each $k$. Thus $H(x, y) = \sum_{k \ge 1} h_k(x, y)$ enjoys the same monotonicity properties. □

Lemma A.4. $H(x, y) < \infty$ for all $x$ and $y$.

Proof. This is immediate from Lemmas A.2-A.3. □

Lemma A.5. The generating function $H(x, y, z)$ at (A.11) is [with $h_0 :\equiv 0$] the unique power-series solution $\widetilde{H}(x, y, z) = \sum_{k \ge 0} \widetilde{h}_k(x, y)\, z^k$ (in $0 \le z \le 1$) to the integral equation (A.12) such that $0 \le \widetilde{h}_k(x, y) \le h_k(x, y)$ for all $k, x, y$.

Proof. We have already seen that $H$ is such a solution. Conversely, if $\widetilde{H}$ is such a solution, then equating coefficients of $z^k$ in the integral equation [which is valid because we know by Lemma A.4 that $H(x, y, z)$, and hence also $\widetilde{H}(x, y, z)$, is finite for $0 \le z \le 1$] we find that the functions $\widetilde{h}_k(x, y)$ satisfy $\widetilde{h}_k \equiv 0$ for $k = 0, 1$ and the recurrence relation (A.6) for $k \ge 2$. It then follows by induction that $\widetilde{h}_k(x, y) = h_k(x, y)$ for all $k, x, y$. □

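Next we let $H_0(x, y, z) :\equiv 0$ and, for $0 \le z \le 1$, inductively define $H_n(x, y, z)$ by applying successive substitutions to the integral equation (A.12): that is, for each $n \ge 1$ we define

\[
H_n(x, y, z) := z \Big[ (1 - x)^{-1} \sum_{k \ge 1} f_k(x)\, z^k + y^{-1} \sum_{k \ge 1} g_k(y)\, z^k
+ \int_0^x (y - \lambda)^{-1} H_{n-1}(\lambda, y, z) \, d\lambda + \int_y^1 (\rho - x)^{-1} H_{n-1}(x, \rho, z) \, d\rho \Big]. \tag{A.14}
\]

Let $[z^k]\, H_n(x, y, z)$ denote the coefficient of $z^k$ in $H_n(x, y, z)$.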

Lemma A.6. For each $k \ge 1$, $[z^k]\, H_n(x, y, z)$ is nondecreasing in $n \ge 0$.

Proof. The inequality $[z^k]\, H_n(x, y, z) \ge [z^k]\, H_{n-1}(x, y, z)$ is proved easily by induction on $n \ge 1$. □

According to the next lemma, $H$ dominates each $H_n$.

Lemma A.7. For all $n \ge 0$ and $k \ge 1$ we have

\[
0 \le [z^k]\, H_n(x, y, z) \le h_k(x, y). \tag{A.15}
\]

Proof. Lemma A.6 establishes the first inequality, and the second is proved easily by induction on $n$. □

Lemmas A.5-A.7 lead to the following lemma:

Lemma A.8. For $0 < x < \alpha < y < 1$ and $0 \le z \le 1$ we have

\[
H_n(x, y, z) \uparrow H(x, y, z) \quad \text{as } n \uparrow \infty.
\]

Proof. Recalling Lemmas A.6-A.7, define $\widetilde{H}(x, y, z)$ to be the power series in $z$ with coefficient of $z^k$ equal to $\widetilde{h}_k(x, y) := \lim_{n \to \infty} [z^k]\, H_n(x, y, z)$, which satisfies $0 \le \widetilde{h}_k(x, y) \le h_k(x, y)$. On the other hand, $\widetilde{H}$ satisfies the integral equation (A.12) by applying the monotone convergence theorem to (A.14). Thus it follows from Lemma A.5 that $\widetilde{H} = H$. Finally, another application of the monotone convergence theorem shows that $H(x, y, z) = \lim_{n \to \infty} H_n(x, y, z)$. □

Our next lemma, when combined with the preceding one, immediately leads to inequality in one direction in (A.13).
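Lemma A.9. For $0 < x < \alpha < y < 1$ and all $n \ge 0$,

\[
H_n(x, y, 1) \le 2 (y - x)^{-2}.
\]

Proof. We will prove this lemma by induction on $n$, starting with

\[
H_0(x, y, 1) = 0 \le 2 (y - x)^{-2}.
\]

Suppose that the claim holds for $n - 1$. Then from (A.14), (A.8), and (A.10) we have

\[
H_n(x, y, 1) \le (1 - x)^{-2} + y^{-2} + \int_0^x 2 (y - \lambda)^{-3} \, d\lambda + \int_y^1 2 (\rho - x)^{-3} \, d\rho = 2 (y - x)^{-2}.
\]

□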



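Finally we are ready to prove (A.13).

Lemma A.10. For $0 < x < \alpha < y < 1$,

\[
H(x, y, 1) = 2 (y - x)^{-2}.
\]

Proof. Define

\[
\widehat{H}(x, y) := 2 (y - x)^{-2} - H(x, y).
\]

Then to prove the desired equality it suffices to show that for any integer $r \ge 0$ we have

\[
0 \le \widehat{H}(x, y) \le \left( \tfrac{2}{3} \right)^r \times 2 (y - x)^{-3}. \tag{A.16}
\]

As remarked earlier, the nonnegativity of $\widehat{H}$ follows from Lemmas A.8-A.9. We prove the upper bound on $\widehat{H}$ in (A.16) by induction on $r$. The bound clearly holds for $r = 0$. Notice that by substituting $z = 1$ and $H(x, y) = 2 (y - x)^{-2} - \widehat{H}(x, y)$ into the integral equation (A.12) we find

\[
\widehat{H}(x, y) = 2 (y - x)^{-2} - H(x, y)
\]
\[
= 2 (y - x)^{-2} - \Big\{ (1 - x)^{-2} + y^{-2} + \int_0^x (y - \lambda)^{-1} \left[ 2 (y - \lambda)^{-2} - \widehat{H}(\lambda, y) \right] d\lambda
+ \int_y^1 (\rho - x)^{-1} \left[ 2 (\rho - x)^{-2} - \widehat{H}(x, \rho) \right] d\rho \Big\}
\]
\[
= \int_0^x (y - \lambda)^{-1} \widehat{H}(\lambda, y) \, d\lambda + \int_y^1 (\rho - x)^{-1} \widehat{H}(x, \rho) \, d\rho.
\]

Thus if we assume that the upper bound in (A.16) holds for $r - 1$, then

\[
\widehat{H}(x, y) \le \left( \tfrac{2}{3} \right)^{r-1} \times 2 \left[ \int_0^x (y - \lambda)^{-4} \, d\lambda + \int_y^1 (\rho - x)^{-4} \, d\rho \right]
\le \left( \tfrac{2}{3} \right)^r \times 2 (y - x)^{-3}.
\]

Hence (A.16) holds for any nonnegative integer $r$. □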